kindle-your-highlights

It scrapes highlights from the kindle.amazon.com website (https://kindle.amazon.com/your_highlights).

Required Gems

nokogiri
jsonify
selenium-webdriver

Dependency

Firefox is the default Selenium driver. It may be possible to select another one by passing the :driver_type option to the constructor.

Usage

$ git clone git://github.com/parroty/kindle-your-highlights.git

$ cd kindle-your-highlights
$ bundle

$ export KINDLE_USERNAME="username"
$ export KINDLE_PASSWORD="password"

$ rake update:all
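
The exported variables can also be reused when calling the library from your own script. The helper below is a hypothetical sketch and not part of the gem; it only assumes the two variable names shown above and fails early when one is missing.

```ruby
# Hypothetical helper, not part of kindle-your-highlights.
# Reads the credentials from an ENV-like object and raises if one is missing.
def kindle_credentials(env = ENV)
  username = env.fetch("KINDLE_USERNAME") { raise ArgumentError, "KINDLE_USERNAME is not set" }
  password = env.fetch("KINDLE_PASSWORD") { raise ArgumentError, "KINDLE_PASSWORD is not set" }
  [username, password]
end
```

The returned pair could then be passed as the first two arguments of KindleYourHighlights.new.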

Rake Command Usage

The default task is "rake update:recent".

rake convert
    call convert:all

rake convert:all
    load a local file and convert into xml/html format

rake convert:html
    load a local file and convert into html format

rake convert:xml
    load a local file and convert into xml format

rake open
    call open:html (TODO : mac only solution)

rake open:html
    open html file (TODO : mac only solution)

rake open:xml
    open xml file (TODO : mac only solution)

rake print
    load a local file and print highlight data

rake update
    call update:new

rake update:all
    retrieve all data from the Amazon server and store it in a local file

rake update:new
    retrieve only newly arrived items from the Amazon server and store them in a local file

rake update:recent
    retrieve the most recent month of data from the Amazon server and store it in a local file

Library Usage Examples

object operation

require 'kindle-your-highlights'

# to create a new KindleYourHighlights object, give it your Amazon email address and password
kindle = KindleYourHighlights.new("[email protected]", "password")

kindle.highlights.each do |highlight|
	highlight.annotation_id      # => a unique value for each highlight, generated by Amazon
	highlight.content            # => the actual highlight text
	highlight.asin               # => the Amazon ASIN for the highlight's product
	highlight.author             # => author of the book from which the highlight is taken
	highlight.title              # => title of the book from which the highlight is taken
	highlight.location           # => highlight location in the book
	highlight.note               # => the user's note added along with the highlight
end

kindle.books.each do |book|
	book.asin                    # => the Amazon ASIN for the book
	book.author                  # => author of the book
	book.title                   # => title of the book
	book.last_update             # => last update of the highlights for the book (last annotated at)
end
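
As a sketch of what can be done with these accessors, the snippet below groups highlight texts by book title. It runs on a stand-in Struct rather than the gem's own objects, so the record type, the helper name, and the sample values (including the ASINs) are placeholders for illustration only.

```ruby
# Stand-in for the gem's highlight objects; only the accessors listed above are assumed.
HighlightRecord = Struct.new(:annotation_id, :content, :asin, :author, :title, :location, :note)

# Groups highlights by book title and returns { title => [content, ...] }.
def contents_by_title(highlights)
  highlights.group_by(&:title).transform_values { |group| group.map(&:content) }
end

# Placeholder data, not real Amazon records.
sample_highlights = [
  HighlightRecord.new("A1", "First passage",  "ASIN-1", "Author One", "Book One", 120, nil),
  HighlightRecord.new("A2", "Second passage", "ASIN-2", "Author Two", "Book Two", 45,  "my note"),
  HighlightRecord.new("A3", "Third passage",  "ASIN-1", "Author One", "Book One", 310, nil)
]

contents_by_title(sample_highlights).each do |title, contents|
  puts "#{title}: #{contents.size} highlight(s)"
end
```

With the real library, kindle.highlights would presumably take the place of sample_highlights.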

xml/html outputs

require 'kindle-your-highlights'

# to create a new KindleYourHighlights object, give it your Amazon email address and password
kindle = KindleYourHighlights.new("[email protected]", "password", { :page_limit => 100, :day_limit => 31, :wait_time => 2 }) do | h |
	puts "loading... [#{h.books.last.title}] - #{h.books.last.last_update}"
end

# xml outputs (needs to create ./xml folder in advance)
KindleYourHighlights::XML.new(:list => kindle.list, :file_name => "xml/out.xml").output

# html outputs (needs to create ./html folder in advance)
KindleYourHighlights::HTML.new(:list => kindle.list, :file_name => "html/out.html").output

differential save/load

require 'kindle-your-highlights'

# to create a new KindleYourHighlights object, give it your Amazon email address and password
kindle = KindleYourHighlights.new("[email protected]", "password", { :page_limit => 100, :wait_time => 2 }) do | h |
	puts "loading... [#{h.books.last.title}]"
end

# load previous file, merge with the new one, and dump it again.
if File.exist?("out.dump")
	list = KindleYourHighlights::List.load("out.dump")
	kindle.merge!(list)
end

KindleYourHighlights::HTML.new(:list => kindle.list, :file_name => "out.html").output
kindle.list.dump("out.dump")
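
The load, merge, and dump cycle above can be pictured with a small stand-alone model. This is only a sketch of the idea: it assumes highlights are de-duplicated by annotation_id with newer entries winning, and it uses Marshal for the dump file. Neither assumption is confirmed by the library, and all names below are hypothetical.

```ruby
require "tmpdir"

# Stand-in record; the gem's own classes are not used here.
SavedHighlight = Struct.new(:annotation_id, :content)

# Merges two arrays of highlights, keeping one entry per annotation_id.
# Entries from `fresh` replace entries from `previous` (an assumed policy).
def merge_highlights(previous, fresh)
  (previous + fresh).each_with_object({}) { |h, acc| acc[h.annotation_id] = h }.values
end

# Persists the list with Marshal (assumed format for this sketch).
def dump_list(list, path)
  File.binwrite(path, Marshal.dump(list))
end

def load_list(path)
  Marshal.load(File.binread(path))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "out.dump")
  dump_list([SavedHighlight.new("A1", "old text"), SavedHighlight.new("A2", "kept")], path)

  fresh = [SavedHighlight.new("A1", "new text"), SavedHighlight.new("A3", "added")]
  merged = File.exist?(path) ? merge_highlights(load_list(path), fresh) : fresh
  dump_list(merged, path)
  puts merged.map(&:annotation_id).join(", ")
end
```

Keying on annotation_id is a natural choice because the README describes it as a unique value generated by Amazon for each highlight.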

options

page_limit : specifies maximum number of pages (books) to be loaded
day_limit : specifies the maximum number of days to retrieve, counted from the "Last annotated on" date to today
stop_date : specifies the "Last annotated on" date at which to stop collecting more data
wait_time : specifies the wait time between page loads in seconds (default is 5 seconds)
block : callback invoked each time a page load completes
driver_type : symbol to identify the selenium driver
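
To make the two date-based limits concrete, here is a minimal sketch of how such a cutoff check could work. The helper is hypothetical and not taken from the library; in particular, it assumes that a book annotated before stop_date is out of range and that day_limit is inclusive.

```ruby
require "date"

# Hypothetical helper illustrating how day_limit and stop_date could bound a crawl.
# Returns true when a book last annotated on `last_annotated` is still in range.
def within_limits?(last_annotated, today: Date.today, day_limit: nil, stop_date: nil)
  # Too old relative to today (assumed inclusive limit).
  return false if day_limit && (today - last_annotated).to_i > day_limit
  # Annotated before the stop date (assumed to be excluded).
  return false if stop_date && last_annotated < stop_date
  true
end

today = Date.new(2013, 3, 1)
puts within_limits?(Date.new(2013, 2, 10), today: today, day_limit: 31)  # 19 days old
puts within_limits?(Date.new(2013, 1, 1),  today: today, day_limit: 31)  # 59 days old
```

Because the highlights page lists books by their last annotation date, a crawler could stop at the first book that falls out of range instead of loading every page.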

Output Examples

xml

XML output example

<?xml version="1.0"?>
<books>
	<book>
		<asin>ASIN</asin>
		<title>TITLE</title>
		<author>AUTHOR</author>
		<highlights>
			<annotation_id>ANNOTATION_ID1</annotation_id>
			<content>CONTENT1</content>
		</highlights>
		<highlights>
			<annotation_id>ANNOTATION_ID2</annotation_id>
			<content>CONTENT2</content>
		</highlights>
	</book>
</books>
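
A file in this layout can be read back with any XML parser. The sketch below uses REXML, which ships with most Ruby installations (nokogiri would work just as well); the helper name is hypothetical, and the sample document simply repeats the structure shown above.

```ruby
require "rexml/document"

sample_xml = <<~XML
  <?xml version="1.0"?>
  <books>
    <book>
      <asin>ASIN</asin>
      <title>TITLE</title>
      <author>AUTHOR</author>
      <highlights>
        <annotation_id>ANNOTATION_ID1</annotation_id>
        <content>CONTENT1</content>
      </highlights>
      <highlights>
        <annotation_id>ANNOTATION_ID2</annotation_id>
        <content>CONTENT2</content>
      </highlights>
    </book>
  </books>
XML

# Returns { book title => [highlight content, ...] } for the layout above.
# Note that each <highlights> element holds a single highlight.
def highlights_by_title(xml)
  doc = REXML::Document.new(xml)
  result = {}
  doc.elements.each("books/book") do |book|
    title = book.elements["title"].text
    result[title] = book.elements.to_a("highlights").map { |h| h.elements["content"].text }
  end
  result
end

highlights_by_title(sample_xml).each do |title, contents|
  puts "#{title}: #{contents.join(' | ')}"
end
```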

html

updates

0.3.0
Changed the engine from Mechanize to Selenium, because Mechanize stopped working for unknown reasons.
0.2.0
Adding client-side features for HTML output (searching, highlighting)
Change output directory in Rakefile (e.g. ../html -> output/html)
0.1.0
Initial upload

notes

This library was originally based on https://github.com/speric/kindle-highlights, but I created a separate project to add features and change the code format.
