The latest desktop version of Adobe’s digital reading software, Adobe Digital Editions 4, appears to be collecting a great deal of data about what its users are reading and uploading that data, unencrypted, back to Adobe’s servers.
The breach was first reported by Nate Hoffelder at the Digital Reader on Monday night. Following a tip from a hacker, he used the network tracking app Wireshark and discovered that Adobe Digital Editions 4 was “gathering data on the ebooks that have been opened, which pages were read, and in what order. All of this data, including the title, publisher, and other metadata for the book is being sent to Adobe’s server in clear text.”
Adobe Digital Editions lets users read DRM-protected EPUB files that are borrowed from libraries or purchased from certain online bookstores. The desktop software, which is what appears to be affected, can be used to read ebooks without a separate device like an e-reader or tablet. It can also be used to transfer ebooks between devices — to sideload an ebook from a computer onto your e-reader, for instance.
Ars Technica and Safari Books’ Liza Daly were able to replicate the issue, with Daly noting that data was being transmitted even for non-DRM’d ebooks. Hoffelder also said that Adobe Digital Editions 4 was transmitting data about ebooks stored elsewhere on his computer — “not just ebooks I opened in DE4, but also ebooks I store in calibre and every epub ebook I happen to have sitting on my hard disk” — though so far I haven’t seen anybody else confirm that.
Older versions of the Adobe Digital Editions software don’t appear to have this problem.
In a statement issued late Tuesday afternoon, Adobe said that “all information collected from the user is collected solely for purposes such as license validation and to facilitate the implementation of different licensing models by publishers. Additionally, this information is solely collected for the eBook currently being read by the user and not for any other eBook in the user’s library or read/available in any other reader.” (See the full statement at the end of this post.) Regarding the unencrypted data, the company told Ars Technica that “in terms of the transmission of the data collected, Adobe is in the process of working on an update to address this issue. We will notify you when a date for this update has been determined.”
It’s unclear how many people this issue is going to affect, but it seems right now that the only people who are susceptible to it are those who read or transfer ebooks on their desktops using the Adobe Digital Editions 4 software. If you buy an ebook from one of the vendors that support EPUB (Barnes & Noble, Kobo, Google) and then read it within that vendor’s app or send it wirelessly to an e-reader, you shouldn’t be affected. Amazon Kindle users don’t have to worry about this because Amazon uses proprietary DRM and doesn’t support EPUB.
That is not to say that Amazon, Google, Kobo and Barnes & Noble don’t collect users’ e-reading data. They definitely do. There just hasn’t been a problem with that data being sent back unencrypted, as far as we know.
Regarding Adobe, libraries may have particular reason for concern, depending on the digital library distributor they’ve teamed up with and the method that distributor uses to help patrons download ebooks. Overdrive, the leading distributor, tweeted Tuesday:
It’s worth noting this doesn’t appear to be a security problem, specifically, with Adobe’s DRM, which is separate from its Digital Editions software (this Boing Boing headline notwithstanding). But if the problem becomes publicized enough, it could turn into a major headache for Adobe (deservedly) and also for libraries. Adobe’s technology is the default for DRM (as the market leader, it’s also very expensive). Libraries would need to find an alternative pretty quickly to avoid being besieged with questions both from patrons who fear their ebooks are spying on them and from publishers who are already concerned about ebook library lending.
Here is Adobe’s full statement:
For more background:
For example, Adobe Digital Editions collects the following information:
· User ID: The user ID is collected to authenticate the user.
· Device ID: The device ID is collected for digital right management (DRM) purposes since publishers typically restrict the number of devices an eBook or digital publication can be read on.
· Certified App ID: The Certified App ID is collected as part of the DRM workflow to ensure that only certified apps can render a book, reducing DRM hacks and compromised DRM implementations.
· Device IP: The device IP is collected to determine the broad geo-location, since publishers have different pricing models in place depending on the location of the reader purchasing a given eBook or digital publication.
· Duration for Which the Book was Read: This information is collected to facilitate limited or metered pricing models where publishers or distributors charge readers based on the duration a book is read. For example, a reader may borrow a book for a period of 30 days. While some publishers/distributers charge for 30-days from the date of the download, others follow a metered pricing model and charge for the actual time the book is read.
· Percentage of the Book Read: This information is collected to allow publishers to implement subscription models where they can charge based on the percentage of the book read. For example, some publishers charge only a percentage of the full price if only a certain percentage of the book is read.
· Additionally, the following data is provided by the publisher as part of the actual license and DRM for the eBook:
o Date of Purchase/Download
o Distributor ID and Adobe Content Server Operator URL
o Metadata of the Book provided by Publisher (including title, author, publisher list price, ISBN number)
This story was updated Tuesday afternoon with comments from Adobe and with some additional information about other retailers.