#require mypages
#title Is there such a thing as Responsible Data Collection?
#header top , Is there such a thing as Responsible Data Collection?

#quote It’s time the industry adopted the following principles of responsible data collection…

Mr Kaiser Fung from 'Principal Analytics Prep' has come up with
(((ref http://junkcharts.typepad.com/numbersruleyourworld/2018/03/7-principles-of-responsible-data-collection.html , 7 Principles of Responsible Data Collection)))
in the face of the Cambridge Analytica disaster.
Since his article returns a false 404 error when visited via
civil-rights-preserving anonymization tools, I will cite all the
relevant bits. Of course, the article that discusses limits on
data collection itself promotes no-opt (that's less than opt-out)
data collection by Google, comScore, Quantcast, disqus etc.

#section First-person not second- or third-person permission
#quote When you create a new Facebook account, you are asked if you’d like to upload a contact list. If you choose not to, Facebook will still have lots of suggested friends for you. How does Facebook know who you know? One source of data is your friends. If your friends agree to upload their contact lists to Facebook, and your name or email or phone number happens to be on those lists, then by a reverse lookup, Facebook knows who your friends are. Such predictions are highly accurate. By uploading their contact lists, your friends have shared your private data without asking your permission – worse, they have given Facebook permission by proxy to take your private data and profit from it. Permission by proxy is dishonest, and should be banned.

I put point 2 first since it's the one we agree upon. For far
too long it has been considered okay for citizens to trade
in data about their social neighborhood, friends and peers.
Even the (((ref https://youbroketheinternet.org/GDPR , GDPR)))
does not impede social data treason.

#section Opt Ins not Opt Outs
#quote Currently, for most websites and mobile apps, the default is maximum data collection. Users wanting privacy then figure out how to limit the amount or type of data collected about them. This is an example of opt-out. The default should instead be opt-in: no data collection unless instructed by users. When the default setting is opt-in, businesses have to win over the users’ trust, and so they will have a much stronger incentive to clarify and explain the benefits of the data collection. Say goodbye to the days of hand-waving claims, coercion and trickery.

As noted above, people are willing to opt in with data that
actually belongs to their friends and peers. If we indeed manage
to make that illegal, then we are still facing the problem that
there is a strong imbalance of knowledge between the organization
gathering the data and the user who is supposed to predict which
apparently harmless information can later be used against them.
Just answering questions on your favorite foods and sports?
Didn't expect that data to end up at your health insurance
company? In the case of Cambridge Analytica, people thought
they were helping a university research project, and then the
data ended up being put to the worst possible use: demolishing
democracy. <b>So I dare to question the entire notion that people
are able to discern which data collection is good for them,</b>
let alone how much it is able to tell about their peers.

Oh, another problem: What if companies simply claim they were
given permission? How will you prove them wrong?
(((ref https://mobile.twitter.com/lynXintl/status/980454940257673216 , Facebook has been caught doing just that))):

#quote Daily Mail, UK, 2018-03-26: How Facebook logs ALL your phone calls and texts - but the social media giant insists the function has always been ‘opt-in only’

#section Stop mis-direction
#quote I’d like to see strong regulation with heavy penalties for businesses that request permission from users for specific uses of their data but then fail to police their data analysts to curb abuses. For example, many websites collect mobile numbers from users, saying that the two-factor authentication is essential to protect their accounts. Once the phone numbers are stored in the database, there is no telling which data analysts will get a hold of the data. Most data analysts will utilize whatever data they can get their hands on. To prevent mis-direction of data, companies should have a data governance function.

Sounds like a better-than-nothing measure. It describes a symptom
of the deeper malady of digital data: you can't trace abuse
because all evidence is just data… some log files at best.
Such evidence doesn't hold up in court, so justice doesn't happen
and anyone who doesn't abuse data is at a strategic disadvantage
to those who do. The market then deals with what's left of ethics.
In a globalized world where competition shapes companies much more
than laws or ethics, this is a losing game, and allowing
companies to hold any such data in the first place is problematic.

Does that mean we shouldn't use digital technology at all?
No! Read on, the solution is at the end!

#section Sunshine Policy
#quote It is technically feasible for Facebook or other companies to keep a log of which third parties have received what data about you from Facebook. If these companies believe that the trading of private data is fundamental to their business models, then they should allow users to inspect how they collected the data, and which entities received the data. Better yet, users should be given the ability to opt out of specific transactions. For example, if Facebook has a deal to sell data to Pfizer, users should have the right to say no, you should not give our data to Pfizer.

So either Facebook has to overwhelm users with opt-in/opt-out
choices, or it makes use of the fact that abuse is nearly
impossible to prove and simply continues doing things
behind users' backs. In the current market situation where
companies are in worldwide competition, the ones that
disregard laws will win. It's mathematical. The solution
is to make laws they cannot disregard. Read on.

#section Wall off the data
#quote If companies are willing to wall off user data, and not send them to third parties, then users are more likely to share the data.

Again naive… how can they afford not to sell data on a market
where everybody else does? And how would you ever know, when it is
in their interest to hide the fact and there is no physical evidence?

#section The right to be forgotten
#quote Europe is ahead of the U.S. on this issue. Companies should be required to delete user data older than say five years. Aggregate statistics older than five years should be allowed. More recent data supersede the older data, so there is negligible value in keeping the old data anyway.

This also builds on the wishful thinking that companies truly
delete data, ever. Why throw away money if they can take the
old data home and sell it on the darknet? How do you expect
to catch anyone on the black data market as long as
(((ref illegalblockchains , Bitcoin is legal))) and the
object offered for sale needs no postal delivery?

#section Stop the blackmail
#quote One reason for the pervasive data sleaze is the favorite business model of web and mobile companies – free service to all, paid for by advertisers. Users are then barred from using the service unless they sign off on extensive snooping. Sometimes, their signatures are not even required; the websites just claim that usage is taken to imply consent. This policy is about taking the cake and eating it too. The website operators don’t really want to ban any user so as to inflate their user counts (“eyeballs”). This practice creates the perception of dishonesty, and is self-defeating, if the companies actually believe that the data collection benefits their users. If the business model is such that users get free service in exchange for their private data, then they should enforce strict access policies, only serving those who acknowledge the data collection.

Same problem of naivety as the previous proposals. In a worldwide
competition, how likely is collaboration on this front?
If some companies comply, won't they quickly be superseded by
competitors who don't?

#section Concluding…

I had hoped this article would offer an alternative to
(((ref https://www.technologyreview.com/s/611488/gary-reback-technologys-trustbuster/ , dismantling big data monopolists))),
but in my view it doesn't. The
collection of big data creates material that is highly valuable,
highly inviting for abuse and impossible to protect.

The kind of big data that makes sense to share with corporations is
the sort of data that has nothing to do with individuals or groups
of human beings. Often it then makes sense to publish it as open data.

For social data, by contrast, the solution should be to build distributed
systems that do not put personal data in the hands of strangers.
At all.

(((ref https://youbroketheinternet.org/programme , Here's how))).

#unterschrift
#index

Last Change: 2019-03-22

#repost responsibledata