If your reaction to the announced demise of Google Reader was to scream “But my starred items!”, then this is the tutorial for you. Read on as we show you multiple ways to extract all your starred articles from Google Reader.
Why Do I Want To Do This?
Google Reader is shutting down on July 1st. If you, like millions of RSS fans across the globe, were a Google Reader fan, there’s a good chance that you used the star function to flag articles to hold onto, to read later, or for some other purpose.
If you would like to rest assured that all those starred articles are safe and sound despite the impending implosion of Google Reader, you’ll need to perform a few minor steps to ensure you have the data in your possession and not left to rot on the Google servers.
When you’ve finished the tutorial, you’ll have (at minimum) a file that contains all your starred items and, depending on which segment of the tutorial you follow, those same starred items in a more user-friendly format.
There is one thing no bit of exporting or automation magic can help with, however, and that’s actually processing the content of the starred articles. If you’ve been starring articles to read later for years now you’re probably going to be shocked at how many exported articles this process generates. You may just have to set aside a little time each day for a few weeks to dig through the resulting dump bit by bit.
Exporting Your Google Reader Data with Google Takeout
The very first order of business is to simply get a copy of all your Google Reader data directly in your possession. This way, no matter what happens to your Reader data on Google’s servers in the future, you’ll have a copy of it to work with.
Google Takeout is a great tool for extracting your data from all sorts of Google services, but we’re only interested in Reader for this tutorial. Visit the Reader subsection of the Google Takeout tool. It will take a moment to calculate the size of the Takeout file. Once it finishes, click Create Archive.
Despite the fact that it’s not exporting your entire Google account but just a little portion of it, the process takes a surprisingly long time. We would recommend checking “Email me when ready” and going to grab a cup of coffee.
When it’s all done, click on the Download button that appears in the lower right hand corner.
Go ahead and extract the archive to a working directory, such as My Documents, and put the archive itself in a safe place. The archive files are arranged as such:
There are two file types in the archive: JSON and XML. JSON (JavaScript Object Notation) is a data interchange format, and XML (Extensible Markup Language) is a handy way to mark up a document so that it is both machine and human readable. The file we’re most interested in for this tutorial is the starred.json file, as it contains all the entries for your starred items.
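If you’d like a feel for what is inside starred.json before converting it, a short Python sketch along these lines can count the entries and tally them by source feed. It assumes Python 3, and the field names it relies on (a top-level items list, with each entry naming its feed under origin and title) are assumptions based on the layout commonly seen in Takeout exports, so check them against your own file. The function name summarize_starred is our own invention.

```python
import json
from collections import Counter


def summarize_starred(path="starred.json"):
    """Return (item count, Counter of source feed titles) for a Takeout export."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the starred entries live in a top-level "items" list.
    items = data.get("items", [])
    # Assumption: each entry names its feed under origin -> title.
    sources = Counter(
        item.get("origin", {}).get("title", "(unknown source)") for item in items
    )
    return len(items), sources
```

Calling summarize_starred() in the folder that holds starred.json returns the total and a counter; sources.most_common(10) then shows which ten feeds you starred most often.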
Of equal importance in the grand scheme of freeing your data from Google Reader and moving onto greener pastures, however, is the subscriptions.xml file. This file contains all your RSS subscriptions and, should you desire to import all your old subscriptions from Google Reader into a new RSS application, this is the file you will use to do so. Definitely keep it (and the original archive you downloaded from Google Takeout) in a safe place.
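Before you hand subscriptions.xml to a new RSS application, you may want to see what it contains. The sketch below assumes the file is a standard OPML document, in which each feed is an outline element with an xmlUrl attribute and folders are outline elements without one. It assumes Python 3, and list_feeds is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def list_feeds(path="subscriptions.xml"):
    """Return (title, feed URL) pairs for every feed in an OPML file."""
    tree = ET.parse(path)
    feeds = []
    # Feeds are <outline> elements carrying an xmlUrl attribute; outlines
    # without one are folders that merely group other outlines.
    for outline in tree.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            title = outline.get("title") or outline.get("text") or url
            feeds.append((title, url))
    return feeds
```

Printing len(list_feeds()) is a quick sanity check that the export holds roughly as many subscriptions as you remember having.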
Converting the Starred Items to Bookmarks
One of the easiest ways to deal with the JSON file is to use JSONView (an extension available for both Firefox and Chrome). This method is best suited for readers with a small number of starred items in Google Reader (fewer than 1,000).
Install the extension for your respective browser and then simply drag and drop the starred.json file onto a new browser pane. Save the resulting file as an HTML document. You can then turn right around and import the HTML document into your web browser of choice and it will import all the links as new bookmarks.
There are two downsides to this technique, however. The first is that you’ll end up with some duplicate URLs in your bookmark file as the domain/main source URL of articles you’ve frequently starred (like say, articles from How-To Geek) will appear multiple times. That’s a little annoying, but not that big of a deal.
The second downside is a deal breaker for those of us with thousands and thousands of starred items: when dealing with a really enormous HTML import, most of the time the import just craps out and never finishes. If you’re a Reader power user with that kind of backlog, importing your starred items as bookmarks just isn’t going to cut it.
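If you are comfortable with a little Python, one possible workaround for both downsides is to build the bookmark files yourself. The sketch below keeps only each article’s own link, skips duplicates, and splits the result into several smaller bookmark files in the Netscape bookmark format that browsers import. It assumes Python 3 and assumes that each entry stores its article URL as the first href under alternate; the function names, the file prefix, and the chunk size of 500 are all our own choices. We have not tested whether smaller files avoid the stalled import in every browser.

```python
import html
import json


def article_links(path="starred.json"):
    """Return unique (title, URL) pairs for the starred articles, in file order."""
    with open(path, encoding="utf-8") as f:
        items = json.load(f).get("items", [])  # assumed top-level key
    seen, links = set(), []
    for item in items:
        # Assumption: the article's own URL is the first "alternate" href,
        # as opposed to the feed's home page stored under "origin".
        alternates = item.get("alternate") or [{}]
        url = alternates[0].get("href")
        if url and url not in seen:
            seen.add(url)
            links.append((item.get("title") or url, url))
    return links


def write_bookmark_files(links, prefix="starred-bookmarks", chunk_size=500):
    """Write links as Netscape-format bookmark files of chunk_size entries each."""
    names = []
    for start in range(0, len(links), chunk_size):
        name = f"{prefix}-{start // chunk_size + 1:03d}.html"
        with open(name, "w", encoding="utf-8") as out:
            out.write("<!DOCTYPE NETSCAPE-Bookmark-file-1>\n")
            out.write('<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">\n')
            out.write("<TITLE>Bookmarks</TITLE>\n<H1>Bookmarks</H1>\n<DL><p>\n")
            for title, url in links[start:start + chunk_size]:
                # Escape both values so odd characters cannot break the markup.
                out.write(f'    <DT><A HREF="{html.escape(url)}">{html.escape(title)}</A>\n')
            out.write("</DL><p>\n")
        names.append(name)
    return names
```

Running write_bookmark_files(article_links()) in the folder that holds starred.json produces files named starred-bookmarks-001.html, starred-bookmarks-002.html, and so on, which you can import one at a time.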
Converting the Starred Items to Individual Links (and Importing to Evernote)
For the kind of heavy processing power users need (the kind of processing that can cut through 5,000+ starred items in minutes), we’re turning to Python to help us crunch through our massive list.
Courtesy of Paul Kerchen and Davide Della Casa, two Google Reader power users who wanted to export all their old starred items, we have two very handy Python scripts that can help us do one of two things: 1) convert all the starred item entries into distinct HTML documents and/or 2) import all of our starred items into Evernote.
For both tricks, you’ll need to have Python installed on your system. Grab a copy of Python for your operating system and install it before proceeding.
After installing Python, visit the site for Kerchen/Casa’s Google Reader Export project and save the export2HTMLFiles.py and export2enex.py files to the same folder to which you extracted your starred.json file.
If you wish to convert all your starred items into distinct HTML files, you can do so using the export2HTMLFiles.py by executing the following command within the directory where your starred.json file is stored:
(If python is not designated as a system-wide command on your machine, replace “python” with the full path to the Python executable, e.g. C:\Python27\python.exe.)
Depending on the number of starred items you have, this process can take anywhere from a few seconds to several minutes. It took around three minutes to rip through 12,000 starred items during our test.
When it is done, you’ll have a series of numbered and named HTML files (e.g. 1 some article you starred.html to 10000 some other article you starred.html). The easiest way to look at them all is to simply load the local directory in your web browser.
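If a raw directory listing feels too bare, a small script can generate a clickable index page instead. The following is a minimal sketch, not part of the Kerchen/Casa scripts: it assumes Python 3 and assumes the files follow the numbered naming pattern described above. The build_index name and the index.html file name are hypothetical.

```python
import html
import os
from urllib.parse import quote


def build_index(folder=".", index_name="index.html"):
    """Write an index page linking to every other .html file in folder."""
    def leading_number(name):
        # Files are named like "12 some title.html"; sort by that number,
        # and push anything without a leading number to the end.
        head = name.split(" ", 1)[0]
        return (int(head), name) if head.isdecimal() else (float("inf"), name)

    pages = sorted(
        (n for n in os.listdir(folder)
         if n.lower().endswith(".html") and n != index_name),
        key=leading_number,
    )
    with open(os.path.join(folder, index_name), "w", encoding="utf-8") as out:
        out.write('<!DOCTYPE html>\n<meta charset="utf-8">\n')
        out.write("<title>Starred items</title>\n<ul>\n")
        for name in pages:
            label = html.escape(os.path.splitext(name)[0])
            # Percent-encode the file name so spaces survive in the link.
            out.write(f'<li><a href="{quote(name)}">{label}</a></li>\n')
        out.write("</ul>\n")
    return pages
```

Run build_index() from the folder that holds the exported files, then open index.html in your browser. Sorting by the leading number keeps item 2 ahead of item 10, which a plain alphabetical listing would not.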
This is a great way to free your starred items from Google Reader and the JSON file, but as we mentioned earlier in the tutorial, if you’ve been saving articles to read them later for years now, you’ll have a monumental task on your hands.
One way you can better manage this task is to use Evernote as a workspace to sort, tag, and potentially delete no longer useful starred items.
There are two ways you can go about importing the items into Evernote. The first is to import the HTML files we created a moment ago using the Import Folders feature. Within your Evernote desktop client, go to Tools -> Import Folders and then create a dump folder for the HTML files. We made a sub-folder in the /Reader/ work folder called Imports and a new notebook in Evernote called Starred Items. By dragging and dropping the HTML files into the /Reader/Imports/ folder, we are able to import them as distinct notes in the Evernote notebook Starred Items. They’re permanently stored there to be reviewed at our leisure.
Alternatively, if you would like to convert all your starred items into a native Evernote notebook in one fell swoop, you can use the second Python script you downloaded, export2enex.py. The advantage of this approach is that it does a slightly better job of preserving the formatting of the documents.
Within the folder where your starred.json file is located, execute the following command:
Take the resulting file StarredImport.enex and import it into your Evernote desktop client using File -> Import -> Evernote Export Files.
At this point, you’ve liberated your starred items, in totality, from Google Reader and you’re ready to get down to the (potentially lengthy) business of sorting through the pile.
Have a clever way to manipulate the JSON file and extract the starred items? Join the discussion below and share your tips and tricks with your fellow readers.