WIP: Zenodo importer. #21
base: master
Changes from 1 commit
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,6 +13,7 @@ class Tool < ActiveRecord::Base | |
| after_save :invalidate_cache | ||
| has_and_belongs_to_many :users | ||
| has_many :citations | ||
| has_many :tool_versions | ||
| validates_uniqueness_of :url | ||
| validates_presence_of :url | ||
| validates_presence_of :name | ||
|
|
@@ -54,8 +55,8 @@ def check_health | |
| when :github | ||
| contents = JSON.parse RestClient.get "https://api.github.com/repos/#{repo_name}/contents", | ||
| {:params => | ||
| - {:client_id => ENV['ST_GITHUB_CLIENT_ID'], | ||
| - 'client_secret' => ENV['ST_GITHUB_CLIENT_SECRET'] | ||
| + {:client_id => ENV['GITHUB_CLIENT_ID'], | ||
| + 'client_secret' => ENV['GITHUB_CLIENT_SECRET'] | ||
|
Review comment: Align the elements of a hash literal if they span more than one line. |
||
| } | ||
| } | ||
| path_key = 'name' | ||
|
|
||
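
On the hash-alignment comment above: a minimal sketch of how the credential parameters could be laid out, one entry per line with keys starting in the same column. The helper name `github_auth_params` and its `env` argument are hypothetical and not part of this PR; the sketch also assumes symbol keys are acceptable for both entries, whereas the diff mixes a symbol key and a string key.

```ruby
# Hypothetical helper (not in the PR): builds the credential parameters for
# the GitHub contents request. `env` defaults to ENV but can be any object
# that responds to #[] with string keys, which keeps the method testable.
def github_auth_params(env = ENV)
  {
    client_id: env["GITHUB_CLIENT_ID"],
    client_secret: env["GITHUB_CLIENT_SECRET"]
  }
end
```

The resulting hash could then be passed as the `:params` value of the existing `RestClient.get` call.
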
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| class ToolVersion < ActiveRecord::Base | ||
| belongs_to :tool | ||
| end |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| class CreateToolVersions < ActiveRecord::Migration | ||
| def change | ||
| create_table :tool_versions do |t| | ||
| t.integer :tool_id | ||
| t.string :url | ||
|
|
||
| t.timestamps null: false | ||
| end | ||
| end | ||
| end |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,82 @@ | ||
| module Importer | ||
| class Zenodo | ||
| XMLNS = {xmlns: "http://datacite.org/schema/kernel-3"} | ||
|
Review comment: Space inside { missing. |
||
|
|
||
| # Long running | ||
| def self.import | ||
| new.import | ||
| end | ||
|
|
||
| def import | ||
| loop do | ||
| @response = open( | ||
| api_url, "User-Agent" => USER_AGENT | ||
| ).read | ||
|
|
||
| @response = Nokogiri::XML(@response) | ||
| @results = @response.css("record") | ||
| results = process_results | ||
| break if results.empty? | ||
| sleep 0.6 | ||
| end | ||
| end | ||
|
|
||
| def resumption_token | ||
| @response.css("resumptionToken").text if @response | ||
| end | ||
|
|
||
| def process_results | ||
| @results.map do |result| | ||
| process_result(result) | ||
| end | ||
| end | ||
|
|
||
| def process_result(result) | ||
| doi_css = "xmlns|identifier[identifierType='DOI']" | ||
| relation_css = "xmlns|relatedIdentifier[relationType='IsSupplementTo']" | ||
| cited_css = "xmlns|relatedIdentifier[relationType='IsCitedBy']" | ||
|
Review comment: Useless assignment to variable - |
||
|
|
||
| # GitHub | ||
| result.css(relation_css, XMLNS).each do |url| | ||
| if url.text =~ /github.com/ | ||
| process_github_url(url.text) | ||
| else | ||
| puts url | ||
|
Review comment: Do not write to stdout. Use Rails' logger if you want to log. |
||
| doi = result.css(doi_css, XMLNS).first | ||
|
Review comment: Useless assignment to variable - |
||
| puts doi | ||
|
Review comment: Do not write to stdout. Use Rails' logger if you want to log. |
||
| #process_other("http://dx.doi.org/#{doi.text}") | ||
|
Review comment: Missing space after #. |
||
| end | ||
| end | ||
|
|
||
| # Citations | ||
| result.css(cited_css, XMLNS).first.try(:tap) do |url| | ||
| puts url | ||
|
Review comment: Do not write to stdout. Use Rails' logger if you want to log. |
||
| end | ||
|
|
||
| result | ||
| end | ||
|
|
||
| def process_github_url(url) | ||
| url_parts = url.match(/.*github.com\/(.+?)\/(.+?)(\/|\z)/) | ||
| main_url = "https://github.com/#{url_parts[1]}/#{url_parts[2]}" | ||
| tool = Tool.where(url: main_url).first_or_create | ||
| puts tool | ||
|
Review comment: Do not write to stdout. Use Rails' logger if you want to log. |
||
| tool_version = ToolVersion.where(url: url, tool: tool).first_or_create | ||
|
Review comment: Useless assignment to variable - |
||
| puts tool_version | ||
|
Review comment: Do not write to stdout. Use Rails' logger if you want to log. |
||
| end | ||
|
|
||
| def process_other(url) | ||
| tool = Tool.where(url: url).first_or_create | ||
|
Review comment: Useless assignment to variable - |
||
| puts tool | ||
|
Review comment: Do not write to stdout. Use Rails' logger if you want to log. |
||
| end | ||
|
|
||
| def api_url | ||
| base_url = "https://zenodo.org/oai2d?verb=ListRecords" | ||
| if resumption_token.present? | ||
| "#{base_url}&resumptionToken=#{resumption_token}" | ||
| else | ||
| "#{base_url}&metadataPrefix=oai_datacite3&set=software" | ||
| end | ||
| end | ||
| end | ||
| end | ||
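
A note on the paging logic in `import` and `api_url` above. OAI-PMH marks the last page of a list with an empty `resumptionToken` element. If Zenodo follows that convention, `api_url` would fall back to the initial request once the token is blank, so the loop may start over instead of stopping. The sketch below shows one way to end on a blank token. It is not the PR's code: `harvest`, `fetch_page`, and `PAGES` are hypothetical names, and the HTTP and XML steps are replaced by a stub so the control flow can be run in isolation.

```ruby
# Hypothetical stand-alone harvester (illustrative, not from the PR).
# `fetch_page` is any callable that takes a resumption token (nil for the
# first request) and returns a pair: [records, next_token].
def harvest(fetch_page, pause: 0)
  token = nil
  records = []
  loop do
    page, token = fetch_page.call(token)
    records.concat(page)
    # A nil or empty token means the list is complete, so stop here
    # instead of issuing the initial request again.
    break if token.nil? || token.empty?
    sleep pause
  end
  records
end

# Stubbed pages standing in for HTTP responses, keyed by the token sent.
PAGES = {
  nil => [%w[a b], "t1"],
  "t1" => [%w[c], ""]
}.freeze

harvest(->(token) { PAGES.fetch(token) })
```

In the importer, the callable would wrap the existing request and Nokogiri parsing, and `pause` would take the place of the `sleep 0.6` call.
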
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| require 'rails_helper' | ||
|
Review comment: Prefer double-quoted strings unless you need single quotes to avoid extra backslashes for escaping. |
||
|
|
||
| RSpec.describe ToolVersion, :type => :model do | ||
|
Review comment: Use the new Ruby 1.9 hash syntax. |
||
| pending "add some examples to (or delete) #{__FILE__}" | ||
| end | ||
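
Returning to the repeated "Do not write to stdout" comments on `Importer::Zenodo`: the sketch below shows how the URL handling in `process_github_url` might report through a logger rather than `puts`. It also guards against a failed match, since `String#match` returns `nil` for a URL such as `https://github.com/foo`, and the original would then raise on `url_parts[1]`. The names `github_main_url` and `GITHUB_REPO` are hypothetical. The sketch uses Ruby's standard `Logger` to stay self-contained; inside the Rails app, `Rails.logger` would be the natural choice, as the comments suggest.

```ruby
require "logger"

# Escaped dot, and captures that stop at the next slash, so only the owner
# and repository segments are taken from the URL.
GITHUB_REPO = %r{github\.com/([^/\s]+)/([^/\s#?]+)}

# Hypothetical helper: returns the canonical repository URL, or nil (with a
# warning) when the input does not contain an owner and repository name.
def github_main_url(url, logger: Logger.new($stderr))
  match = GITHUB_REPO.match(url)
  unless match
    logger.warn("Skipping non-repository URL: #{url}")
    return nil
  end
  "https://github.com/#{match[1]}/#{match[2]}"
end
```

A caller could skip the `Tool` and `ToolVersion` lookups whenever this returns `nil`.
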