• Unity caught spying on developers using the editor
    68 replies, posted
That telemetry is impossible to gather without personally identifiable information. Any vector or means you might use that isn't the typical avenue becomes intensely questionable as to whether or not it's accurately sorting users into the right buckets. We could attempt to discriminate by average dropped packets or by ping-time but that's about it for information that could be considered 'non personally identifiable'. Some would actually argue it's worse because now you're showing that you're specifically obfuscating IPs but keeping the e-mails. That's personally identifiable information. Again, I don't think, personally, it matters so long as your data is displayed in the following manner to anyone who queries that database but some would contest that.
b195307 | 1912ms max | 1 Restart(s)
9fd3b95c | 1882ms max | 0 Restart(s)
It wouldn't be that difficult to generate the identifier based upon boot time. Even if the entire populace of Earth was using Unity, two people having identical boot times is unlikely in the extreme. For example, in JavaScript you can easily get a Unix timestamp in milliseconds. I don't have any experience with dates in stuff like C# or C++ but I'd assume it's pretty easy as well. "1532702124962" or its hex version "164DC2A6BA2" (assuming it's my boot time rather than the time I got the date in Firefox's console) is unique without any real identifying information. Since those timestamps have millisecond precision, you'd need around a thousand users booting up their computers within the same second for there to be a real risk of two users ending up with the same identifier. Actually that would be a matter of miscommunication. In your example I thought that with the way the emails were given they were already obfuscated. And I still disagree. There's no need for it to be personally identifiable to begin with, so whether the people who can normally view the data can see that info or not is irrelevant. If those people can't see the data then what reason do they even have to collect it to begin with? The only purpose it would seem to serve is to collate the information, which my suggestion does viably, or some other invasive purpose.
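For anyone who wants to see the boot-time idea from the post above in code, here is a minimal Python sketch. It assumes the boot time is already available as milliseconds since the Unix epoch; how you obtain it is platform-specific and not shown. The function name `boot_time_id` is made up for illustration and is not anything Unity actually does.

```python
def boot_time_id(boot_time_ms: int) -> str:
    """Encode a boot timestamp (milliseconds since the Unix epoch) as uppercase hex.

    Hypothetical helper: the caller is assumed to supply the boot time.
    """
    if boot_time_ms < 0:
        raise ValueError("boot time must be a non-negative millisecond timestamp")
    return format(boot_time_ms, "X")


# The timestamp quoted in the post above.
print(boot_time_id(1532702124962))  # 164DC2A6BA2
```

Note that the hex form is only a more compact spelling of the same number, so it hides nothing that the raw timestamp would reveal.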
That information you've called out only identifies users who don't reset their computers. That's a volatile assumption which will lead to unreliable data if we're trying to track the sorts of information we want here.
The point of tracking by boot time like that is to be able to track a user's session through restarting Unity but without personally identifying them. (Which would be a useful piece of telemetry if it's clear they're having an issue after restarting the program.) I'm not really certain how it would lead to unreliable data anyways. If anything it would make it easier to figure out if an issue is possibly related to system stability since systems tend to destabilize over time. And it still doesn't identify the user so I'm not sure where you're getting that. The only piece of remotely identifiable information in my suggestion is boot time which at best gives you a vague idea of what timezone they're in. If someone could personally identify you using your computer's boot time that'd be legitimately impressive.
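As a rough illustration of following a session across editor restarts with a boot-time identifier, here is a Python sketch that groups events by boot ID and counts restarts. The event layout (`boot_id`, `launch_id`, `latency_ms`) and the function name are assumptions made up for this example. The output format simply mimics the rows shown earlier in the thread.

```python
from collections import defaultdict


def summarize_sessions(events):
    """Group hypothetical telemetry events by boot ID.

    events: iterable of (boot_id, launch_id, latency_ms) tuples, where
    launch_id changes each time the editor is started on the same boot.
    """
    launches = defaultdict(set)
    max_ms = defaultdict(int)
    for boot_id, launch_id, latency_ms in events:
        launches[boot_id].add(launch_id)
        max_ms[boot_id] = max(max_ms[boot_id], latency_ms)
    # A restart is any editor launch after the first one on the same boot.
    return {
        boot_id: {"max_ms": max_ms[boot_id], "restarts": len(ids) - 1}
        for boot_id, ids in launches.items()
    }


events = [
    ("164DC2A6BA2", 1, 1500),
    ("164DC2A6BA2", 2, 1912),  # same boot, editor restarted once
    ("164DC2A6BB0", 1, 1882),
]
for boot_id, s in summarize_sessions(events).items():
    print(f"{boot_id} | {s['max_ms']}ms max | {s['restarts']} Restart(s)")
```

Rebooting the machine produces a new boot ID, so this only links editor restarts within a single boot, which is the trade-off being argued about in the surrounding posts.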
If a user's in a situation wherein they require help/assistance, it's likely that they would be attempting to troubleshoot a problem -- which means a reset is not only likely it's to be expected. Of course it identifies the user. It doesn't tell you their name, or anything like that, but it does identify them. If something's personally identifiable, that means it can identify you or your data -- that's more than meaning just 'it gives out your name' or 'it gives out your information'. It means you're 'not (at least reasonably-speaking) anonymous'. Transient/volatile data is less identifiable to a person -- but if we're going to rule that IP addresses are identifiable to a person then lots and lots of transient data-types are in that same category of 'volatile -- but possibly identifiable'. Also, it's transient/volatile which by definition means it's not reliable for anything more than casual reference/interest.
IP addresses and emails are very much so personally identifiable information. If you want a permanent anonymous identifier then that's still possible. A lot of telemetry is tracked using GUIDs which are set once. The issue with those comes with the fact that it's still possible to passively track people like companies such as Google, Facebook, and Amazon do across the web. Also on this bit in particular: If you're in contact with tech support then you're very likely to be including in-depth logs to begin with so this argument is completely irrelevant.
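The "GUID set once" approach mentioned above might look something like this in Python. This is a sketch under assumptions: the storage location and the name `get_install_id` are hypothetical, and real telemetry clients may persist the value differently.

```python
import tempfile
import uuid
from pathlib import Path


def get_install_id(path: Path) -> str:
    """Return a random identifier, generating and saving it on first use."""
    if path.exists():
        return path.read_text().strip()
    new_id = str(uuid.uuid4())  # random, not derived from hardware or account data
    path.write_text(new_id)
    return new_id


# Demo using a temporary directory as a stand-in for a real config directory.
with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "install_id.txt"
    first = get_install_id(id_file)
    second = get_install_id(id_file)
    print(first == second)  # True
```

The identifier itself says nothing about the user, but because it is stable it can still be used to link activity over time, which is the passive-tracking concern raised in the post above.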
People who're looking for help on their own are only potentially liable to get in contact with tech support. The vast majority of my issues in Unity I never file bug reports for or seek communication on.
Wouldn't there be a huge amount of resources on your PC disappearing into the aether if they had something like screen capture or keylogging data being sent through? From what I gleaned through Twitter, the email is a slightly modified version of the one they send out when you haven't logged on in over 6 months. It's definitely worded like they were actively watching him, or at least they could see he had it open and was basically idling. Creepy. Well if they really wanna see my lewd furry vrc avatars I guess...
Unity responded - Apparently the email was not just because the person in question was idle: https://twitter.com/unity3d/status/1022885152354009088?s=19