The Truth About Data Collection And How “Personal” Your Information Really Is

The practice of data collection began thousands of years before Cambridge Analytica stole the personal information of 87 million Facebook users. As far back as 18,000 BCE, people living near what is now the border between Uganda and the Democratic Republic of the Congo are thought to have used the “Ishango Bone” to record and store data.

Modern data collection is traced back to London in 1663, when John Graunt ran the first known experiment in statistical data analysis. He tracked mortality information to help combat the bubonic plague. It was not until the late 1920s that modern data storage, the kind of storage that makes Cambridge Analytica so controversial, began to take shape. This is where our journey begins…

Nikola Tesla Theorizes The World Will Become A Brain

Nikola Tesla first theorized digital data collection.jpg

In 1926 Nikola Tesla made a prediction about the earth that few at the time believed could come true. In an interview with Collier’s magazine, the legendary inventor said, “the whole Earth will be converted into a huge brain…A man will be able to carry one [telephone] in his vest pocket.”

Tesla believed this would be possible because he already considered the world a brain, with all the particles surrounding us gathering data and communicating with one another. Two years later Fritz Pfleumer discovered how to magnetically record data, the first step in the realization of Tesla’s prediction.

Magnetic Data Storage Comes To Life In 1928

the first magnetic tape recorder.jpg

Photo Credit: ullstein bild/ullstein bild via Getty Images

In 1928, German-Austrian engineer Fritz Pfleumer invented a way to record data on tape magnetically. The invention would lead to the creation of the tape recorder, the first of which debuted in 1935. This new system of recording data might seem irrelevant today, but trust us, it’s not.

While the methods of data storage have changed drastically since Pfleumer’s discovery in 1928, a number of the principles he introduced remain the same. Much of the digital data stored in 2018 is still recorded magnetically, on computer hard drives.

Business Intelligence Booms With The Babies

ibm recording data in 1965.jpg

Photo Credit: Keystone-France/Gamma-Keystone via Getty Images

By 1958 businesses were ready to get into the data boom. Hans Luhn, working for IBM, believed that collecting data would guide businesses to achieve their monetary goals. The idea is similar to the concept used by social media companies today to drive advertising.

Seven years later the US Government got in on the action by planning the first data storage center. The center was designed to hold 742 million tax records on magnetic tape, with room for 175 million sets of fingerprints as well. With these advances in data storage, a new era in information began.

Data Becomes Accessible To Everyone in 1970

data farming in 1972.jpg

Photo Credit: Alan Band/Fox Photos/Getty Images

A huge breakthrough in data collection occurred in 1970, when Edgar F. Codd introduced his “relational database” framework. The IBM employee proposed storing information in simple tables of rows and columns, a universally accessible format that offered an alternative to the rigid hierarchical structures of earlier systems.

Put simply, anyone who knows what they are looking for can find it, without needing to know how the data is physically stored. This format provided the spine that most data businesses use today. With the system for mass data collection in place, Big Data came to be, and the term itself was first used in 1989.
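To make the relational idea concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, columns, and rows are invented for illustration; they do not come from Codd’s work or from any real company’s database.

```python
import sqlite3

def customers_who_bought(item):
    """Return sorted names of customers who bought `item` (hypothetical data)."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE purchases (customer_id INTEGER, item TEXT);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Edgar');
        INSERT INTO purchases VALUES
            (1, 'radio'), (2, 'radio'), (2, 'tape'), (3, 'tape');
    """)
    # The query states WHAT is wanted; the database works out HOW to find it.
    rows = conn.execute(
        "SELECT c.name FROM customers AS c "
        "JOIN purchases AS p ON p.customer_id = c.id "
        "WHERE p.item = ? ORDER BY c.name",
        (item,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]

print(customers_who_bought("radio"))  # ['Ada', 'Grace']
```

The two tables are linked only by a shared customer number, which is the core of the relational approach: facts live in separate tables and are combined at the moment someone asks a question.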

Big Data Has Its Coming Out Party

tim berners-lee.jpg

Photo Credit: Amy E. Price/Getty Images for SXSW

For decades businesses have claimed that the data they collect is for the benefit of the consumer. In 1989 Erik Larson coined the term “Big Data” and challenged the claim, writing in Harper’s Magazine, “data has a way of being used for purposes other than originally intended.”

Big Data took its next significant step in 1991 with the debut of the World Wide Web. Tim Berners-Lee laid out the parameters and specifics for a globally connected “web of data.” And just like Codd’s “relational database,” the web would be accessible to anyone.

The Internet Becomes User Generated

an early social media website.jpg

Photo Credit: Leon Neal/AFP/Getty Images

For the next 15 years, the internet grew, and the amount of data stored expanded exponentially. In 2005 a new breakthrough occurred: the “user-generated web.” This new era turned the web over to users, who responded by generating massive amounts of data through social media.

Facebook, seemingly harmless to its users, had 5.5 million of them by 2005. The data shared among this network of friends could be accessed by businesses to gain an advertising advantage. With the controversy surrounding Cambridge Analytica, the scope of the problem was revealed to be much larger. But we’ll get to that soon.

Mobile Computing Takes Over

steve jobs shows off the first iphone.jpg

Photo Credit: SHAUN CURRY/AFP/Getty Images

Smartphones had been around in some form since 1992 but didn’t take off until the release of the first BlackBerry smartphone in 2002. Five years later LG released the Prada, the first phone with a capacitive touchscreen and no keyboard. Shortly after, Apple released the iPhone, taking the market by storm.

By 2014 mobile phones had surpassed traditional desktop and laptop computers for digital data consumption. This dangerous development also transformed the way businesses could collect and track data, expanding each user’s uniquely targeted advertising experience to include location and voice monitoring. For businesses to turn this data into financial gain, it must be mined.

How Data Mining Works

google search engine.jpg

Photo Credit: TORSTEN SILZ/AFP/Getty Images

Data mining has been around since the 1990s. With the rise of the internet and mobile technology, the controversy surrounding it has only grown. The concept behind data mining is simple: a business takes the data it has collected, analyzes it, and uses the information to predict consumer trends and potential needs.
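As a rough illustration of that collect, analyze, predict loop, the sketch below counts which products tend to be bought together. The shopping baskets and the `min_count` threshold are made up for the example, and real data mining systems use far more sophisticated methods.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_count=2):
    """Count how often two items appear in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sort so ("bread", "milk") and ("milk", "bread") count as one pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    # Keep only pairs seen often enough to suggest a pattern.
    return {pair: n for pair, n in counts.items() if n >= min_count}

baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "jam"],
]
print(frequent_pairs(baskets))
# {('bread', 'milk'): 2, ('eggs', 'milk'): 2}
```

A store that sees these pairs might advertise milk to anyone who buys bread, which is the same logic, on a tiny scale, behind targeted advertising.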

The Google search engine was developed by Larry Page and Sergey Brin to process this kind of information quickly. Drawing on a user’s search history and trend analytics, Google ranks its results so that the best matches appear first. As Cambridge Analytica proved, this data can be used to affect much more than a user’s experience.

Cambridge Analytica Gets Caught

a presentation for cambridge analytica.jpg

Photo Credit: Bryan Bedder/Getty Images for Concordia Summit

Cambridge Analytica is a British consulting firm that worked with Donald Trump’s presidential campaign as well as on the Brexit campaign. The company was banned from Facebook after it was caught mining the information of millions of the site’s users to influence election results.

Obtaining user information from the third-party app thisisyourdigitallife, Cambridge Analytica was able to access the data of 270,000 users. The firm then used this access to collect the information of the app users’ friends. This breach of trust sent the total number of accounts mined skyrocketing to 87 million.
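The jump from 270,000 to 87 million is easier to picture with a small sketch. The names and friend lists below are invented, and the function only illustrates the general idea of reaching friends through app users; it is not a description of how the actual app or Facebook’s systems worked.

```python
def accounts_reached(app_users, friends_of):
    """Return every account visible through the app users and their friends."""
    reached = set(app_users)
    for user in app_users:
        # One hop only: friends of app users, not friends of friends.
        reached.update(friends_of.get(user, ()))
    return reached

# Hypothetical friend lists for two people who installed the app.
friends_of = {
    "alice": ["bob", "carol", "dave"],
    "erin": ["carol", "frank"],
}
app_users = ["alice", "erin"]
reached = accounts_reached(app_users, friends_of)
print(len(app_users), "app users ->", len(reached), "accounts reached")

# Rough ratio implied by the figures in the article.
print(round(87_000_000 / 270_000))  # 322
```

By the article’s own numbers, each person who installed the app exposed roughly 320 accounts on average, almost all of them belonging to people who never saw the app.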

Separating The Legal From The Illegal

this is your digital life.jpg

Photo Credit: Jaap Arriens/NurPhoto via Getty Images

Before going any further, it is important to point out that the 270,000 thisisyourdigitallife users permitted the collection of their data. All third-party applications work this way when new users sign up. However, Aleksandr Kogan, who developed thisisyourdigitallife, had no permission to allow another company to obtain his user base’s information.

Cambridge Analytica then took the illegal step of going into users’ profiles and mining data from their friends. Nearly all of the 87 million affected users that Facebook reported never signed up for the app. The massive breach of trust has placed Facebook under fire. The company has responded by changing and updating numerous policies.

Facebook Updates Privacy Policies

facebook privacy policy.jpg

Photo Credit: Richard Atrero de Guzman/NurPhoto via Getty Images

Things are changing at Facebook amidst the Cambridge Analytica scandal. The company, first and foremost, plans to ban all companies using authorized profiles to mine data from unauthorized ones. The policy has been in place since 2014, but Zuckerberg admits it has not been monitored appropriately.

Mark Zuckerberg also plans to limit third-party access to user information. Most critically, third-party apps will no longer have permissions to see guest lists for events. These actions will hopefully work to decrease illegal data mining through the website and gain back user trust.

Violated Users Will Be Informed

users want better privacy protection from facebook.jpg

Photo Credit: Justin Sullivan/Getty Images

Facebook plans to do more than change its policies to save its user base. The website is going to be transparent with anyone whose data was misused in any way, starting with those affected by thisisyourdigitallife.

Mark Zuckerberg has taken full responsibility for the mishandling of Cambridge Analytica. Determined to not make the same mistake twice, he plans to expand the bug bounty program, used to reward hackers who uncover policy loopholes companies take advantage of. As Equifax proved, it’s better to employ hackers than be hurt by them.

Equifax Is A Green Light For Identity Thieves

equifax testifying to senate.jpg

Photo Credit: Mark Wilson/Getty Images

Not to be outdone by Facebook, Equifax was the victim of one of the most extensive data breaches on record. In 2017, Equifax was hacked, putting over 140 million American consumers at risk of identity theft. The credit reporting company keeps personal information on consumers, including driver’s license and Social Security numbers.

Many consumers might be at risk and not even know it, because the company is used by businesses whenever credit checks are performed. The hack began in the middle of May and wasn’t discovered until the end of July. That two-to-three-month window represented a significant opportunity for hackers to mine and abuse data.

Equifax Has Suffered Three Breaches Since 2015

Photo Credit: Mark Wilson/Getty Images

Counting 2017, Equifax has been the victim of three massive data breaches since 2015. That year identity thieves successfully hacked the company twice, stealing the valuable W-2 information of millions of consumers. The tax documents provided all the information the criminals needed.

The effects of the combined breaches are devastating. In total, Equifax handles data for more than 91 million businesses and 820 million consumers worldwide. One fraud analyst described the severity of the hacks as the highest possible concern.

Yahoo Has One Billion Accounts Hacked


Photo Credit: Justin Sullivan/Getty Images

Yahoo suffered a data hack in 2013 that put one billion user accounts at risk. The data stolen from the attack included birthdays, phone numbers, encrypted security questions, names, and encrypted passwords.

The company, which identified the 2013 attack after dealing with a similar attack in 2014, says it doesn’t know how many users were victimized. As a precaution, Yahoo invalidated all user security questions and forced users to reset their passwords. Critics of Yahoo blamed its lack of attention to security in recent years for making it an easy target.

Arby’s Data Mines To Keep You From Starving

arbys restaurant.jpg

Photo Credit: Rick Diamond/Getty Images for Arby’s

Fast food giant Arby’s currently uses the slogan, “we have the meats!” The inspiration for the slogan may be a result of data mining. The company has found increasing use for data analytics in deciding which demographics to target with its advertising.

Through the use of data mining, Arby’s can directly see which advertisements are most effective. It can also see which networks airing its advertisements draw the most foot traffic its way. By increasing the efficiency of its advertising, Arby’s has vastly improved its positioning in the ongoing fast food wars. As Arby’s demonstrates, data mining is not always bad. Here are a few ways it can be useful in society!

Educational Data Mining Is Growing Fast

young student learning.jpg

Photo Credit: Tim Boyle/Getty Images

A relatively new field, educational data mining works to develop methods that learn student patterns in order to improve their schooling. By recording a student’s behavior, educational organizations can track patterns and predict trends. This, in theory, will help individualize every student’s education to increase their opportunity for success.

Data mining is also being used to predict assessment results for each student. Information collected this way will shape what educators are teaching and how. Though some consider it controversial and an invasion of privacy, educational data mining, if done correctly, could help improve the failing school system of the United States.

Future Healthcare Concerns Can Be Predicted

patients waitig for the doctor.jpg

Photo Credit: Jerry Redfern/LightRocket via Getty Images

As in education, data mining is a relatively new endeavor in the field of healthcare. By using analytics, healthcare providers may be able to improve best practices and reduce patient costs.

One of the big advantages of healthcare analytics is how it will help hospitals adjust their workforce to serve patients better. Data will show hospitals and providers how many patients with which kinds of illness need help each day. Using this information should reduce wait times and increase facility efficiency.

A Non-Traditional Fraud Detector

wall street.jpg

Photo Credit: Spencer Platt/Getty Images

Fraud has destroyed countless lives. Current fraud detection methods take time and resources. Data mining as a resource for fraud detection could protect millions of people’s investments. Being able to analyze large amounts of data in minutes would save millions of dollars.

This is not a reality yet, because the activities being processed would first need to be accurately labeled as fraudulent or not. Using patterns to determine this is controversial at best, although workable models are being designed. One proposed method is supervised: sample records are collected and labeled in advance. With this information, a workable model can be tested and implemented.
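To show what learning from labeled sample records might look like, here is a deliberately tiny sketch. It uses a nearest-centroid rule, chosen only because it fits in a few lines; the article does not say which model is being proposed. The features (dollar amount and number of transactions in the past hour) and every record are invented.

```python
from math import dist
from statistics import mean

def train(records):
    """Average the features of each label: a nearest-centroid model."""
    by_label = {}
    for features, label in records:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(model, features):
    """Pick the label whose average record is closest to `features`."""
    return min(model, key=lambda label: dist(model[label], features))

# Hypothetical labeled samples: (amount in dollars, transactions in past hour).
labeled = [
    ((20.0, 1), "ok"),
    ((35.0, 2), "ok"),
    ((15.0, 1), "ok"),
    ((900.0, 9), "fraud"),
    ((1200.0, 12), "fraud"),
]
model = train(labeled)
print(predict(model, (25.0, 1)))     # ok
print(predict(model, (1000.0, 10)))  # fraud
```

The quality of such a model depends entirely on the labels it is given, which is why accurate labeling is the sticking point.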

Criminal Investigations Solved Quickly

data mining is being used to improve police work.jpg

Photo Credit: Scott Olson/Getty Images

One field where data mining is used to help the greater good is criminal investigations. With extensive databases available to be mined, data analysts look for similar patterns that can help catch criminals.

Handwritten reports by police officers can be transcribed to text, entered into the system, and used to calculate the probability that crimes match. While data mining will not identify criminals by itself, it does help identify traits and characteristics they share. This narrows the field for police officers in their search and, in theory, reduces racial or socio-economic bias.
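One simple way to picture crime matching is to score how many words two transcribed reports share. The sketch below uses Jaccard similarity, a standard overlap measure picked here purely for illustration; the reports are invented, and real systems are likely to be far more elaborate.

```python
def jaccard(a, b):
    """Share of distinct words two reports have in common (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def closest_reports(new_report, past_reports, top=2):
    """Rank past reports by word overlap with a new one."""
    ranked = sorted(
        past_reports, key=lambda r: jaccard(new_report, r), reverse=True
    )
    return ranked[:top]

# Hypothetical transcribed reports.
past = [
    "rear window forced at night jewelry taken",
    "car stolen from parking lot",
    "rear window forced jewelry and cash taken",
]
new = "rear window forced cash taken"
for report in closest_reports(new, past):
    print(round(jaccard(new, report), 2), report)
# 0.71 rear window forced jewelry and cash taken
# 0.5 rear window forced at night jewelry taken
```

A high score does not name a suspect. It only flags past cases that share a pattern and are worth a second look.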