As some background, one of the major problems in dealing with unstructured data is extracting meaningful information from it.

As an example, let's use a PDF file of a newspaper article. First, you need to extract the text out of it. Then, you have to figure out what the entities are, and entities generally are people, places, and things. Then you need to deal with time relevance.
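A rough sketch of the last two steps, assuming the text has already been pulled out of the PDF. The sample article, the function names, and the regex rules are all made up for illustration; a real engine would use a trained recognizer rather than capitalization patterns.

```python
import re

# Assumption: the article text has already been extracted from the PDF.
# The sample below is invented purely for illustration.
ARTICLE = (
    "Jane Doe opened a bakery in Springfield in 1998. "
    "Jane Doe sold it in 2004."
)


def find_entity_candidates(text):
    """Return runs of two or more capitalized words (a crude entity guess)."""
    return re.findall(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b", text)


def find_years(text):
    """Return four-digit years from 1900-2099 as integers, in order of appearance."""
    return [int(year) for year in re.findall(r"\b(?:19|20)\d{2}\b", text)]


if __name__ == "__main__":
    print(find_entity_candidates(ARTICLE))
    print(find_years(ARTICLE))
```

Note that the single-word place name "Springfield" slips through this rule entirely, which is the kind of gap that pushes you toward a trained model.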

To do most of those things you need to train your NLP engine to recognize entities. If you train it with too little data it'll get too specialized. If you train it with too much data it'll get too generalized.

What is an entity? It could be a name. What's a name? Sometimes it's firstname, lastname. Sometimes it's lastname, firstname. Both of those are valid representations of a canonical name. The trick is to recognize that both of those are really referring to the same thing.
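To make the name point concrete, here is a minimal sketch that folds both orderings into one key. The function name and the "comma means lastname first" rule are assumptions for illustration; real names (middle names, suffixes, single-word names) need far more care.

```python
def canonical_name(raw):
    """Map 'Last, First' and 'First Last' to the same lowercase 'first last' key."""
    cleaned = " ".join(raw.split())  # collapse runs of whitespace
    if "," in cleaned:
        # Assumption: a comma means the name was written lastname-first.
        last, first = (part.strip() for part in cleaned.split(",", 1))
        cleaned = f"{first} {last}"
    return cleaned.casefold()


if __name__ == "__main__":
    # Both spellings of this hypothetical name produce the same key.
    print(canonical_name("Doe, Jane") == canonical_name("Jane Doe"))
```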

When two different entities share the same name (this problem is usually called entity disambiguation), the engine needs to find more context. Is the Saddam Hussein you're talking about the former dictator of Iraq or the guy who owns the shawarma place down the block?
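One crude way to use context is to compare the words around the name with a short description of each candidate and pick the best overlap. The candidate keys and descriptions below are hypothetical stand-ins for a real knowledge base, and plain word overlap is just the simplest possible scoring, not what any production engine does.

```python
import re

# Hypothetical candidate entries; a real system would pull these from a knowledge base.
CANDIDATES = {
    "saddam_hussein_iraq": "former dictator president of iraq baghdad regime war",
    "saddam_hussein_shawarma": "owner of shawarma restaurant down the block food lunch",
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(context, candidates=CANDIDATES):
    """Return the candidate key whose description shares the most words with the context."""
    context_words = tokens(context)
    scores = {
        key: len(context_words & tokens(description))
        for key, description in candidates.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(disambiguate("Saddam Hussein ruled Iraq until the 2003 war"))
    print(disambiguate("I had lunch at the shawarma place Saddam Hussein runs"))
```

Common words like "the" count toward the score here, so a real version would at least drop stop words and weight rarer terms more heavily.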

Entity stuff is still a major problem in the academic world, and there's still a lot to figure out. All the solutions today are pretty clunky. In real life it's faster to just pay Mechanical Turk workers to do the entity tagging for you, because people are still way faster and more accurate than computers.

I suppose if this approach works it'd be good, because it would allow you to actually query most of the stuff that's out there in web pages, etc., and get back "an answer."
 



Apple recently paid around $200 million to acquire Lattice Data, a firm that aims to turn unstructured "dark data" such as text and images into structured data that can then be handled with traditional data analysis tools. News of the acquisition comes from TechCrunch, and Apple has essentially confirmed the acquisition by issuing its standard statement on the topic.


Lattice uses machine learning techniques to take mass amounts of initially unusable data and turn it into properly labeled and categorized data that can be used for AI, medical research, and more. TechCrunch says the deal closed "a couple of weeks ago," with roughly 20 Lattice engineers having joined Apple.

Article Link: Apple Acquires 'Dark Data' Machine Learning Company Lattice Data
Jeff Hawkins of Palm Pilot fame still hasn't pulled Numenta out of the ditch and into the spotlight? Wow.
 
I don't like Apple collecting so much data; it's creepy.

Edit: Not because I don't trust Apple; it's because I don't trust the US Government. One court order and your data belongs to the Government.
 
I don't like Apple collecting so much data; it's creepy.

Edit: Not because I don't trust Apple; it's because I don't trust the US Government. One court order and your data belongs to the Government.
Do you think they are the only ones scraping the web?
 
I feel like I just watched an advertisement for "snake oil"....

If the usefulness of something cannot be explained in a way that is easily understood, then either 1) it is BS, or 2) the only people who DO understand it were not involved in the attempt to explain it.

There's loads of advertising that assumes a certain amount of knowledge on the part of the target audience. So long as the ad is intelligible to that audience, all is well. For example, most people in the market for a car understand "anti-lock brakes" without needing an explanation of their function and purpose.

Those outside of the target audience may scratch their heads, but that's alright. Considering that "ad" is a web page, it just has to be meaningful to people searching on terms like "dark data."

As the replies in this thread show, there are certainly those who immediately understood that meaning, most likely because they're involved in data mining in one form or another. There are plenty who are, on a small scale or large. Those who aren't are outside the target audience.
 
Yep, no info but a verbose "guess". Thanks for nothing.
Uh, a verbose guess seems about right for a rumor site, no?

But as long as we're thanking each other for nothing, thanks for not answering my question about what "advertisement" you watched but taking the time to criticize my response to someone else.
 
After reading the headline, I asked myself:

How do we, as humans, turn unstructured data into "structures", into meaning? Even understanding involuntary gestures to derive meaning (such as involuntary facial expressions, body language, guttural reactions...)

It appears to all come down to our neurons -- past experiences, rote learning, memories, and "reasoning", the ability to apply logic and extract meaning from zero-day events that were never experienced or learned before.

That zero-day part is the hard part for machines, because then they will "reason". And once they can, they become like us.

Just thinking aloud.
 
Just out of curiosity, is there a "master list" somewhere of all the companies that Apple has acquired over the last, say, 10 years? I've been reading MacRumors for the better part of 6 years and they've bought sooooo many companies.
 