-rw-r--r-- | extra/bulk_edits/2022-07-12_jalc.md | 47
-rw-r--r-- | extra/bulk_edits/2022-07-12_orcid.md | 64
-rw-r--r-- | extra/bulk_edits/2022-07-19_doaj.md | 78
-rw-r--r-- | extra/bulk_edits/CHANGELOG.md | 15
-rw-r--r-- | extra/cleanups/container_publisher_type.md | 100
-rw-r--r-- | extra/stats/2022-07-14-prod-stats.json | 1
-rw-r--r-- | extra/stats/2022-07-14-prod-table-sizes.txt | 47
-rw-r--r-- | notes/merge_releases_examples.txt | 3
-rw-r--r-- | python/fatcat_tools/importers/chocula.py | 3
-rw-r--r-- | python/fatcat_tools/importers/doaj_article.py | 4
-rw-r--r-- | python/tests/files/ISSN-to-ISSN-L.snip.txt | 1
-rw-r--r-- | python/tests/files/example_doaj_articles.json | 10
-rw-r--r-- | python/tests/import_doaj.py | 3
13 files changed, 369 insertions, 7 deletions
diff --git a/extra/bulk_edits/2022-07-12_jalc.md b/extra/bulk_edits/2022-07-12_jalc.md new file mode 100644 index 00000000..d9f09fee --- /dev/null +++ b/extra/bulk_edits/2022-07-12_jalc.md @@ -0,0 +1,47 @@ + +Import of a 2022-04 JALC DOI metadata snapshot. + +Note that we had downloaded a prior 2021-04 snapshot, but don't seem to have +ever imported it. + +## Download and Archive + +URL for bulk snapshot is available at the bottom of this page: <https://form.jst.go.jp/enquetes/jalcmetadatadl_1703> + +More info: <http://japanlinkcenter.org/top/service/service_data.html> + + wget 'https://japanlinkcenter.org/lod/JALC-LOD-20220401.gz?jalcmetadatadl_1703' + wget 'http://japanlinkcenter.org/top/doc/JaLC_LOD_format.pdf' + wget 'http://japanlinkcenter.org/top/doc/JaLC_LOD_sample.pdf' + + mv 'JALC-LOD-20220401.gz?jalcmetadatadl_1703' JALC-LOD-20220401.gz + + ia upload jalc-bulk-metadata-2022-04 -m collection:ia_biblio_metadata jalc_logo.png JALC-LOD-20220401.gz JaLC_LOD_format.pdf JaLC_LOD_sample.pdf + +## Import + +As of 2022-07-19, 6,502,202 release hits for `doi_registrar:jalc`. + +Re-download the file: + + cd /srv/fatcat/datasets + wget 'https://archive.org/download/jalc-bulk-metadata-2022-04/JALC-LOD-20220401.gz' + gunzip JALC-LOD-20220401.gz + cd /srv/fatcat/src/python + + wc -l /srv/fatcat/datasets/JALC-LOD-20220401 + 9525225 + +Start with some samples: + + export FATCAT_AUTH_WORKER_JALC=[...] + shuf -n100 /srv/fatcat/datasets/JALC-LOD-20220401 | ./fatcat_import.py --batch-size 100 jalc - /srv/fatcat/datasets/ISSN-to-ISSN-L.txt + # Counter({'total': 100, 'exists': 89, 'insert': 11, 'skip': 0, 'update': 0}) + +Full import (single threaded): + + cat /srv/fatcat/datasets/JALC-LOD-20220401 | pv -l | ./fatcat_import.py --batch-size 100 jalc - /srv/fatcat/datasets/ISSN-to-ISSN-L.txt + # 9.53M 22:26:06 [ 117 /s] + # Counter({'total': 9510096, 'exists': 8589731, 'insert': 915032, 'skip': 5333, 'inserted.container': 119, 'update': 0}) + +Wow, almost a million new releases! 
7,417,245 results for `doi_registrar:jalc`. diff --git a/extra/bulk_edits/2022-07-12_orcid.md b/extra/bulk_edits/2022-07-12_orcid.md new file mode 100644 index 00000000..760a16c8 --- /dev/null +++ b/extra/bulk_edits/2022-07-12_orcid.md @@ -0,0 +1,64 @@ + +Annual ORCID import, using 2021 public data file. Didn't do this last year, so +a catch-up, and will need to do another update later in 2022 (presumably in +November/December). + +Not sure how many records this year. Current count on the orcid.org website is +over 14 million ORCIDs, in July 2022. + +Files download from: + +- <https://info.orcid.org/orcids-2021-public-data-file-is-now-available> +- <https://orcid.figshare.com/articles/dataset/ORCID_Public_Data_File_2021/16750535> +- <https://archive.org/details/orcid-dump-2021> + +## Prep + + ia upload orcid-dump-2021 -m collection:ia_biblio_metadata ORCID_2021_10_* orcid-logo.png + + wget https://github.com/ORCID/orcid-conversion-lib/raw/master/target/orcid-conversion-lib-3.0.7-full.jar + + java -jar orcid-conversion-lib-3.0.7-full.jar --tarball -i ORCID_2021_10_summaries.tar.gz -v v3_0 -o ORCID_2021_10_summaries_json.tar.gz + + tar xvf ORCID_2021_10_summaries_json.tar.gz + + fd .json ORCID_2021_10_summaries/ | parallel cat {} | jq . -c | pv -l | gzip > ORCID_2021_10_summaries.json.gz + # 12.6M 27:59:25 [ 125 /s] + + zcat ORCID_2021_10_summaries.json.gz | shuf -n10000 | gzip > ORCID_2021_10_summaries.sample_10k.json.gz + + ia upload orcid-dump-2021 ORCID_2021_10_summaries.json.gz ORCID_2021_10_summaries.sample_10k.json.gz + +## Import + +Fetch to prod machine: + + wget https://archive.org/download/orcid-dump-2021/ORCID_2021_10_summaries.json.gz + wget https://archive.org/download/orcid-dump-2021/ORCID_2021_10_summaries.sample_10k.json.gz + +Sample: + + export FATCAT_AUTH_WORKER_ORCID=[...] 
+ zcat /srv/fatcat/datasets/ORCID_2021_10_summaries.sample_10k.json.gz | ./fatcat_import.py orcid - + # in 2020: Counter({'total': 10000, 'exists': 7356, 'insert': 2465, 'skip': 179, 'update': 0}) + # this time: Counter({'total': 10000, 'exists': 7577, 'insert': 2191, 'skip': 232, 'update': 0}) + +Bulk import: + + export FATCAT_AUTH_WORKER_ORCID=[...] + time zcat /srv/fatcat/datasets/ORCID_2021_10_summaries.json.gz | pv -l | parallel -j8 --round-robin --pipe ./fatcat_import.py orcid - + 12.6M 1:24:04 [2.51k/s] + Counter({'total': 1574111, 'exists': 1185437, 'insert': 347039, 'skip': 41635, 'update': 0}) + Counter({'total': 1583157, 'exists': 1193341, 'insert': 348187, 'skip': 41629, 'update': 0}) + Counter({'total': 1584441, 'exists': 1193385, 'insert': 349424, 'skip': 41632, 'update': 0}) + Counter({'total': 1575971, 'exists': 1187270, 'insert': 347190, 'skip': 41511, 'update': 0}) + Counter({'total': 1577323, 'exists': 1188892, 'insert': 346759, 'skip': 41672, 'update': 0}) + Counter({'total': 1586719, 'exists': 1195610, 'insert': 349115, 'skip': 41994, 'update': 0}) + Counter({'total': 1578484, 'exists': 1189423, 'insert': 347276, 'skip': 41785, 'update': 0}) + Counter({'total': 1578728, 'exists': 1190316, 'insert': 346445, 'skip': 41967, 'update': 0}) + + real 84m5.297s + user 436m26.428s + sys 41m36.959s + +Roughly 2.7 million new ORCIDs, great! diff --git a/extra/bulk_edits/2022-07-19_doaj.md b/extra/bulk_edits/2022-07-19_doaj.md new file mode 100644 index 00000000..d25f2dda --- /dev/null +++ b/extra/bulk_edits/2022-07-19_doaj.md @@ -0,0 +1,78 @@ + +Doing a batch import of DOAJ articles. Will need to do another one of these +soon after setting up daily (OAI-PMH feed) ingest. 
+ +## Prep + + wget https://doaj.org/csv + wget https://doaj.org/public-data-dump/journal + wget https://doaj.org/public-data-dump/article + + mv csv journalcsv__doaj_20220719_2135_utf8.csv + mv journal doaj_journal_data_2022-07-19.tar.gz + mv article doaj_article_data_2022-07-19.tar.gz + + ia upload doaj_data_2022-07-19 -m collection:ia_biblio_metadata ../logo_cropped.jpg journalcsv__doaj_20220719_2135_utf8.csv doaj_journal_data_2022-07-19.tar.gz doaj_article_data_2022-07-19.tar.gz + + tar xvf doaj_journal_data_2022-07-19.tar.gz + cat doaj_journal_data_*/journal_batch_*.json | jq .[] -c | pv -l | gzip > doaj_journal_data_2022-07-19_all.json.gz + + tar xvf doaj_article_data_2022-07-19.tar.gz + cat doaj_article_data_*/article_batch*.json | jq .[] -c | pv -l | gzip > doaj_article_data_2022-07-19_all.json.gz + + ia upload doaj_data_2022-07-19 doaj_journal_data_2022-07-19_all.json.gz doaj_article_data_2022-07-19_all.json.gz + +On fatcat machine: + + cd /srv/fatcat/datasets + wget https://archive.org/download/doaj_data_2022-07-19/doaj_article_data_2022-07-19_all.json.gz + +## Prod Article Import + + git rev: 582495f66e5e08b6e257360097807711e53008d4 + (includes DOAJ container-id required patch) + + date: Tue Jul 19 22:46:42 UTC 2022 + + `doaj_id:*`: 1,335,195 hits + +Start with sample: + + zcat /srv/fatcat/datasets/doaj_article_data_2022-07-19_all.json.gz | shuf -n1000 > /srv/fatcat/datasets/doaj_article_data_2022-07-19_sample.json + + export FATCAT_AUTH_WORKER_DOAJ=[...] + cat /srv/fatcat/datasets/doaj_article_data_2022-07-19_sample.json | pv -l | ./fatcat_import.py doaj-article --issn-map-file /srv/fatcat/datasets/ISSN-to-ISSN-L.txt - + # Counter({'total': 1000, 'exists': 895, 'exists-fuzzy': 93, 'insert': 9, 'skip': 3, 'skip-no-container': 3, 'update': 0}) + +Pretty few imports. + +Full ingest: + + export FATCAT_AUTH_WORKER_DOAJ=[...] 
+ zcat /srv/fatcat/datasets/doaj_article_data_2022-07-19_all.json.gz | pv -l | parallel -j6 --round-robin --pipe ./fatcat_import.py doaj-article --issn-map-file /srv/fatcat/datasets/ISSN-to-ISSN-L.txt - + # Counter({'total': 1282908, 'exists': 1145439, 'exists-fuzzy': 117120, 'insert': 16357, 'skip': 3831, 'skip-no-container': 2641, 'skip-title': 1190, 'skip-doaj-id-mismatch': 161, 'update': 0}) + +Times 6x, around 100k releases added. + +Got a bunch of: + + /1/srv/fatcat/src/python/fatcat_tools/importers/doaj_article.py:233: UserWarning: unexpected DOAJ ext_id match after lookup failed doaj=fcdb7a7a9729403d8d99a21f6970dd1d ident=wesvmjwihvblzayfmrvvgr4ulm + warnings.warn(warn_str) + /1/srv/fatcat/src/python/fatcat_tools/importers/doaj_article.py:233: UserWarning: unexpected DOAJ ext_id match after lookup failed doaj=1455dfe24583480883dbbb293a4bc0c6 ident=lfw57esesjbotms3grvvods5dq + warnings.warn(warn_str) + /1/srv/fatcat/src/python/fatcat_tools/importers/doaj_article.py:233: UserWarning: unexpected DOAJ ext_id match after lookup failed doaj=88fa65a33c8e484091fc76f4cda59c25 ident=22abqt5qe5e7ngjd5fkyvzyc4q + warnings.warn(warn_str) + /1/srv/fatcat/src/python/fatcat_tools/importers/doaj_article.py:233: UserWarning: unexpected DOAJ ext_id match after lookup failed doaj=eb7b03dc3dc340cea36891a68a50cce7 ident=ljedohlfyzdkxebgpcswjtd77q + warnings.warn(warn_str) + /1/srv/fatcat/src/python/fatcat_tools/importers/doaj_article.py:233: UserWarning: unexpected DOAJ ext_id match after lookup failed doaj=519617147ce248ea88d45ab098342153 ident=a63bqkttrbhyxavfr7li2w2xf4 + +Should investigate! + +Also, noticed that DOAJ importer is hitting `api.fatcat.wiki`, not the public +API endpoint. Guessing this is via fuzzycat. + +1,434,266 results for `doaj_id:*`. + +Then did a follow-up sandcrawler ingest, see notes in that repository. Note +that newer ingest can crawl doaj.org, bypassing the sandcrawler SQL load, but +the direct crawling is probably still faster. 
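The import notes in this commit estimate totals from per-worker `Counter(...)` summaries: the DOAJ note multiplies one worker's counts by six, while the ORCID note lists all eight workers. A minimal sketch of summing such lines exactly, assuming the worker output has been saved as text. The helper name `sum_counters` is hypothetical and is not part of the fatcat repository; the two sample lines are copied from the ORCID notes.

```python
import ast
import re
from collections import Counter

# Matches the dict literal inside a "Counter({...})" summary line.
# Assumes the dict is flat (no nested braces), as in the notes.
COUNTER_RE = re.compile(r"Counter\((\{.*?\})\)")


def sum_counters(log_text: str) -> Counter:
    """Add up every Counter({...}) summary found in log_text."""
    total = Counter()
    for match in COUNTER_RE.finditer(log_text):
        # literal_eval only parses Python literals; it does not run code.
        total.update(ast.literal_eval(match.group(1)))
    return total


if __name__ == "__main__":
    sample = """
    Counter({'total': 1574111, 'exists': 1185437, 'insert': 347039, 'skip': 41635, 'update': 0})
    Counter({'total': 1583157, 'exists': 1193341, 'insert': 348187, 'skip': 41629, 'update': 0})
    """
    print(sum_counters(sample)["insert"])  # 695226
```

Summing the recorded lines avoids the "times N" approximation when every worker's summary is available. If only one worker's line was kept, as in the DOAJ run, the multiplied figure remains an estimate.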
diff --git a/extra/bulk_edits/CHANGELOG.md b/extra/bulk_edits/CHANGELOG.md index f7b9e536..3c7be454 100644 --- a/extra/bulk_edits/CHANGELOG.md +++ b/extra/bulk_edits/CHANGELOG.md @@ -16,6 +16,21 @@ Ran a journal-level metadata update, using chocula. Cleaned up just under 500 releases with missing `container_id` from an older DOAJ article import. +Imported roughly 100k releases from DOAJ, new since 2022-04. + +Imported roughly 2.7 million new ORCiD `creator` entities, using the 2021 dump +(first update since 2020 dump). + +Imported almost 1 million new DOI release entities from JALC, first update in +more than a year. + +Imported at least 400 new dblp containers, and an unknown number of new dblp +release entities. + +Cleaned up about a thousand containers with incorrect `publisher_type`, based +on current publisher name. Further updates will populate after the next chocula +import. + ## 2022-04 diff --git a/extra/cleanups/container_publisher_type.md b/extra/cleanups/container_publisher_type.md new file mode 100644 index 00000000..dba800d3 --- /dev/null +++ b/extra/cleanups/container_publisher_type.md @@ -0,0 +1,100 @@ + +A bunch of MDPI journals are incorrectly listed as 'longtail'. + + fatcat-cli search container 'publisher:mdpi publisher_type:* !publisher_type:oa' --count + # 245 + +Because this is 'extra' metadata, need a little python script to change the +metadata (fatcat-cli doesn't have this feature yet): + + import sys + import json + + publisher_type = sys.argv[1].strip().lower() + #print(publisher_type, file=sys.stderr) + + for line in sys.stdin: + if not line.strip(): + continue + container = json.loads(line) + container["extra"]["publisher_type"] = publisher_type + print(json.dumps(container)) + +Run some cleanups: + + export FATCAT_AUTH_WORKER_CLEANUP=[...] 
+ export FATCAT_API_AUTH_TOKEN=$FATCAT_AUTH_WORKER_CLEANUP + + fatcat-cli search container 'publisher:mdpi publisher_type:* !publisher_type:oa' --entity-json --limit 50 \ + | jq 'select(.publisher_type != "oa")' -c \ + | python3 ./container_publisher_type.py oa \ + | fatcat-cli batch update container --description "Update container publisher_type" + # editgroup_oum6mnkl2rbn3jaua4a2gdlj5q + +Looks good, run the rest: + + fatcat-cli search container 'publisher:mdpi publisher_type:* !publisher_type:oa' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "oa")' -c \ + | python3 ./container_publisher_type.py oa \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + +Some more cleanup patterns: + + fatcat-cli search container 'publisher:"Frontiers Media SA" publisher_type:* !publisher_type:oa' --count + # 84 + + fatcat-cli search container 'publisher:"Frontiers Media SA" publisher_type:* !publisher_type:oa' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "oa")' -c \ + | python3 ./container_publisher_type.py oa \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + + fatcat-cli search container 'publisher:"Walter de Gruyter" publisher_type:* !publisher_type:commercial !publisher_type:archive' --count + # 47 + + fatcat-cli search container 'publisher:"Walter de Gruyter" publisher_type:* !publisher_type:commercial !publisher_type:archive' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "commercial")' -c \ + | python3 ./container_publisher_type.py commercial \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + + fatcat-cli search container 'publisher:"springer" publisher_type:* !publisher_type:big5 !publisher_type:archive' --count + # 56 + + fatcat-cli search container 'publisher:"springer" publisher_type:* !publisher_type:big5 !publisher_type:archive' --entity-json --limit 300 \ + | jq 
'select(.publisher_type != "big5")' -c \ + | python3 ./container_publisher_type.py big5 \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + + fatcat-cli search container 'publisher:"elsevier" publisher_type:* !publisher_type:big5 !publisher_type:archive' --count + # 98 + + fatcat-cli search container 'publisher:"elsevier" publisher_type:* !publisher_type:big5 !publisher_type:archive' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "big5")' -c \ + | python3 ./container_publisher_type.py big5 \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + + fatcat-cli search container 'publisher:"wiley" publisher_type:* !publisher_type:big5 !publisher_type:archive' --count + # 37 + + fatcat-cli search container 'publisher:"wiley" publisher_type:* !publisher_type:big5 !publisher_type:archive' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "big5")' -c \ + | python3 ./container_publisher_type.py big5 \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + + fatcat-cli search container 'publisher:taylor publisher:francis publisher_type:* !publisher_type:big5 !publisher_type:archive' --count + # 558 + + fatcat-cli search container 'publisher:taylor publisher:francis publisher_type:* !publisher_type:big5 !publisher_type:archive' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "big5")' -c \ + | python3 ./container_publisher_type.py big5 \ + | fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + + fatcat-cli search container 'publisher:sage publisher_type:* !publisher_type:big5 !publisher_type:archive' --count + # 28 + + fatcat-cli search container 'publisher:sage publisher_type:* !publisher_type:big5 !publisher_type:archive' --entity-json --limit 300 \ + | jq 'select(.publisher_type != "big5")' -c \ + | python3 ./container_publisher_type.py big5 \ + 
| fatcat-cli batch update container --description "Update container publisher_type" --auto-accept + +Overall, around a thousand containers updated. Changes to releases will not be +reflected until they are re-indexed. diff --git a/extra/stats/2022-07-14-prod-stats.json b/extra/stats/2022-07-14-prod-stats.json new file mode 100644 index 00000000..62d06606 --- /dev/null +++ b/extra/stats/2022-07-14-prod-stats.json @@ -0,0 +1 @@ +{"changelog":{"latest":{"index":6036957,"timestamp":"2022-07-14T18:53:18.228827+00:00"}},"container":{"total":193300},"papers":{"in_kbart":78102604,"in_web":36247601,"in_web_not_kbart":18551021,"is_oa":25281045,"total":128995907},"release":{"refs_total":1340195856,"total":184966214}} diff --git a/extra/stats/2022-07-14-prod-table-sizes.txt b/extra/stats/2022-07-14-prod-table-sizes.txt new file mode 100644 index 00000000..b4fae69a --- /dev/null +++ b/extra/stats/2022-07-14-prod-table-sizes.txt @@ -0,0 +1,47 @@ +PostgreSQL 13.5 - wbgrp-svc502.us.archive.org +Size: 735.11G + + table_name | table_size | indexes_size | total_size +---------------------------------------+------------+--------------+------------ + "public"."release_contrib" | 88 GB | 32 GB | 121 GB + "public"."refs_blob" | 119 GB | 2200 MB | 121 GB + "public"."release_rev" | 85 GB | 25 GB | 110 GB + "public"."file_rev" | 36 GB | 29 GB | 65 GB + "public"."release_edit" | 18 GB | 21 GB | 39 GB + "public"."file_rev_url" | 31 GB | 8106 MB | 39 GB + "public"."abstracts" | 35 GB | 3671 MB | 39 GB + "public"."work_edit" | 17 GB | 20 GB | 37 GB + "public"."file_edit" | 18 GB | 16 GB | 34 GB + "public"."release_ident" | 12 GB | 12 GB | 23 GB + "public"."work_ident" | 12 GB | 11 GB | 23 GB + "public"."file_rev_release" | 8975 MB | 10 GB | 19 GB + "public"."file_ident" | 7775 MB | 7615 MB | 15 GB + "public"."work_rev" | 7753 MB | 5238 MB | 13 GB + "public"."release_ref" | 6721 MB | 5662 MB | 12 GB + "public"."release_rev_abstract" | 5035 MB | 7250 MB | 12 GB + "public"."webcapture_rev_cdx" | 
4341 MB | 419 MB | 4760 MB + "public"."creator_edit" | 934 MB | 1042 MB | 1976 MB + "public"."creator_rev" | 928 MB | 730 MB | 1658 MB + "public"."editgroup" | 1294 MB | 256 MB | 1550 MB + "public"."creator_ident" | 631 MB | 647 MB | 1277 MB + "public"."release_rev_extid" | 524 MB | 649 MB | 1173 MB + "public"."changelog" | 383 MB | 301 MB | 685 MB + "public"."container_rev" | 249 MB | 60 MB | 308 MB + "public"."webcapture_edit" | 82 MB | 53 MB | 135 MB + "public"."container_edit" | 63 MB | 69 MB | 132 MB + "public"."webcapture_rev_url" | 65 MB | 22 MB | 87 MB + "public"."webcapture_rev_release" | 24 MB | 35 MB | 59 MB + "public"."webcapture_rev" | 45 MB | 14 MB | 59 MB + "public"."webcapture_ident" | 27 MB | 27 MB | 54 MB + "public"."container_ident" | 13 MB | 20 MB | 34 MB + "public"."auth_oidc" | 104 kB | 160 kB | 264 kB + "public"."editor" | 96 kB | 160 kB | 256 kB + "public"."editgroup_annotation" | 80 kB | 48 kB | 128 kB + "public"."fileset_rev_file" | 88 kB | 32 kB | 120 kB + "public"."fileset_edit" | 16 kB | 48 kB | 64 kB + "public"."fileset_rev_url" | 16 kB | 32 kB | 48 kB + "public"."fileset_rev_release" | 8192 bytes | 32 kB | 40 kB + "public"."fileset_ident" | 8192 bytes | 32 kB | 40 kB + "public"."fileset_rev" | 16 kB | 16 kB | 32 kB + "public"."__diesel_schema_migrations" | 8192 bytes | 16 kB | 24 kB +(41 rows) diff --git a/notes/merge_releases_examples.txt b/notes/merge_releases_examples.txt index ca65705e..a2f3e297 100644 --- a/notes/merge_releases_examples.txt +++ b/notes/merge_releases_examples.txt @@ -19,3 +19,6 @@ https://fatcat.wiki/release/search?q=NeuroTrends+Visualization 45 versions across two figshare works +https://fatcat.wiki/container/hqhtzu2pufgulc5pdw2tfx55e4 + + container has several DOAJ/DOI duplicates diff --git a/python/fatcat_tools/importers/chocula.py b/python/fatcat_tools/importers/chocula.py index 8c410d3e..38802bcb 100644 --- a/python/fatcat_tools/importers/chocula.py +++ b/python/fatcat_tools/importers/chocula.py @@ -136,6 
+136,9 @@ class ChoculaImporter(EntityImporter): do_update = True if ce.extra.get("webarchive_urls") and not ce.extra.get("webarchive_urls", []): do_update = True + if ce.extra.get("publisher_type") and not ce.extra.get("publisher_type"): + # many older containers were missing this metadata + do_update = True for k in ("kbart", "ia", "doaj"): # always update these fields if not equal (chocula override) if ce.extra.get(k) and ce.extra[k] != existing.extra.get(k): diff --git a/python/fatcat_tools/importers/doaj_article.py b/python/fatcat_tools/importers/doaj_article.py index 8f5e7acf..64c05773 100644 --- a/python/fatcat_tools/importers/doaj_article.py +++ b/python/fatcat_tools/importers/doaj_article.py @@ -100,6 +100,10 @@ class DoajArticleImporter(EntityImporter): container_name = None break + if not container_id: + self.counts["skip-no-container"] += 1 + return None + volume = clean_str(bibjson["journal"].get("volume")) # NOTE: this schema seems to use "number" as "issue number" issue = clean_str(bibjson["journal"].get("number")) diff --git a/python/tests/files/ISSN-to-ISSN-L.snip.txt b/python/tests/files/ISSN-to-ISSN-L.snip.txt index 2569c443..96cfb4c0 100644 --- a/python/tests/files/ISSN-to-ISSN-L.snip.txt +++ b/python/tests/files/ISSN-to-ISSN-L.snip.txt @@ -9,3 +9,4 @@ ISSN ISSN-L 0000-0108 0002-0108 0000-0140 0002-0140 1877-3273 1877-3273 +1234-5678 1234-5678 diff --git a/python/tests/files/example_doaj_articles.json b/python/tests/files/example_doaj_articles.json index 018a4800..5e018176 100644 --- a/python/tests/files/example_doaj_articles.json +++ b/python/tests/files/example_doaj_articles.json @@ -1,5 +1,5 @@ -{"last_updated":"2020-02-04T14:11:44Z","bibjson":{"identifier":[{"id":"0264-1275","type":"pissn"},{"id":"10.1016/j.matdes.2016.06.110","type":"DOI"}],"journal":{"volume":"108","number":"","country":"GB","license":[{"open_access":true,"title":"CC BY-NC-ND","type":"CC 
BY-NC-ND","url":"https://www.elsevier.com/journals/materials-and-design/0264-1275/open-access-journal"}],"issns":["0264-1275","1873-4197"],"publisher":"Elsevier","language":["EN"],"title":"Materials & Design"},"month":"10","end_page":"617","year":"2016","start_page":"608","subject":[{"code":"TA401-492","scheme":"LCC","term":"Materials of engineering and construction. Mechanics of materials"}],"author":[{"affiliation":"State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China","name":"Xinfeng Li"},{"affiliation":"Department of Geosciences, Center for Materials by Design, State University of New York, Stony Brook, NY 11794-2100, USA","name":"Jin Zhang"},{"affiliation":"School of Chemical Engineering & Technology, China University of Mining and Technology, Xuzhou 221116, China","name":"Yanfei Wang"},{"affiliation":"State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China","name":"Sicong Shen"},{"affiliation":"State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China; Corresponding author.","name":"Xiaolong Song"}],"link":[{"type":"fulltext","url":"http://www.sciencedirect.com/science/article/pii/S0264127516308723"}],"abstract":"The tensile properties and fracture behavior of PH 13-8 Mo steel after subjected to pre-charged hydrogen were investigated by slow strain rate tensile tests. The results suggest that hydrogen slightly increases yield strength, while decreases tensile strength. The susceptibility to hydrogen embrittlement of specimens aged at 650 °C firstly reduces and then increases as the aging time increases, reaching the lowest value at aging time 4 h. This is dominantly attributed to the highest content of austenite. 
Moreover, hydrogen-induced crack nucleation sites initiate from lath, packet and prior austenite grain boundaries. Crack propagation passes through lath boundaries and walks along packet, prior austenite grain boundaries. Scanning electron microscopy result indicates that hydrogen-charged specimens show quasi-cleavage fracture and intergranular fracture in annular brittle zone while dimple fracture is observed in hydrogen-free specimens. Keywords: Hydrogen embrittlement, PH 13-8 Mo steel, Aging time, Fracture behavior","title":"Effect of hydrogen on tensile properties and fracture behavior of PH 13-8 Mo steel"},"created_date":"2019-06-05T05:25:15Z","id":"e58f08a11ecb495ead55a44ad4f89808"} -{"last_updated":"2020-02-04T08:06:42Z","bibjson":{"identifier":[{"id":"2072-6694","type":"eissn"},{"id":"10.3390/cancers9080107","type":"doi"}],"journal":{"volume":"9","number":"8","country":"CH","license":[{"open_access":true,"title":"CC BY","type":"CC BY","url":"http://www.mdpi.com/journal/cancers/about"}],"issns":["2072-6694"],"publisher":"MDPI AG","language":["EN"],"title":"Cancers"},"month":"8","keywords":["ALK rearrangement, lung cancer, biology, immunohistochemistry, FISH, molecular biology."],"year":"2017","start_page":"107","subject":[{"code":"RC254-282","scheme":"LCC","term":"Neoplasms. Tumors. Oncology. Including cancer and carcinogens"}],"author":[{"affiliation":"Laboratory of Clinical and Experimental Pathology, Pasteur Hospital, 30 avenue de la voie romaine, 06001 Nice cedex 01, France","name":"Paul Hofman"}],"link":[{"content_type":"pdf","type":"fulltext","url":"https://www.mdpi.com/2072-6694/9/8/107"}],"abstract":"Patients with advanced-stage non-small cell lung carcinoma (NSCLC) harboring an ALK rearrangement, detected from a tissue sample, can benefit from targeted ALK inhibitor treatment. Several increasingly effective ALK inhibitors are now available for treatment of patients. 
However, despite an initial favorable response to treatment, in most cases relapse or progression occurs due to resistance mechanisms mainly caused by mutations in the tyrosine kinase domain of ALK. The detection of an ALK rearrangement is pivotal and can be done using different methods, which have variable sensitivity and specificity depending, in particular, on the quality and quantity of the patient’s sample. This review will first highlight briefly some information regarding the pathobiology of an ALK rearrangement and the epidemiology of patients harboring this genomic alteration. The different methods used to detect an ALK rearrangement as well as their advantages and disadvantages will then be examined and algorithms proposed for detection in daily routine practice.","title":"ALK in Non-Small Cell Lung Cancer (NSCLC) Pathobiology, Epidemiology, Detection from Tumor Tissue and Algorithm Diagnosis in a Daily Practice"},"admin":{"seal":true},"created_date":"2018-10-26T07:49:34Z","id":"937c7aa790e048d4ae5f53a2ad71f0dc"} -{"last_updated":"2020-02-04T13:43:13Z","bibjson":{"identifier":[{"id":"1178-2013","type":"pissn"}],"end_page":"818","keywords":["bioconjugation","biosurfactant","cancer therapy","folic acid receptor","graphene quantum dots","theranostic tool"],"year":"2019","subject":[{"code":"R5-920","scheme":"LCC","term":"Medicine (General)"}],"author":[{"name":"Bansal S"},{"name":"Singh J"},{"name":"Kumari U"},{"name":"Kaur IP"},{"name":"Barnwal RP"},{"name":"Kumar R"},{"name":"Singh S"},{"name":"Singh G"},{"name":"Chatterjee M"}],"link":[{"content_type":"html","type":"fulltext","url":"https://www.dovepress.com/development-of-biosurfactant-based-graphene-quantum-dot-conjugate-as-a-peer-reviewed-article-IJN"}],"abstract":"Smriti Bansal,1 Joga Singh,2 Uma Kumari,3 Indu Pal Kaur,2 Ravi Pratap Barnwal,4 Ravinder Kumar,3 Suman Singh,5 Gurpal Singh,2 Mary Chatterjee1 1Biotechnology Engineering, University Institute of Engineering & Technology, Panjab University, 
Chandigarh, India; 2Department of Pharmaceutical Sciences, University Institute of Pharmaceutical Sciences, Panjab University, Chandigarh, India; 3Department of Zoology, Panjab University, Chandigarh, India; 4Department of Biophysics, Panjab University, Chandigarh, India; 5Department of Agronomics, Central Scientific Instruments Organisation, Chandigarh, India Background: Biosurfactants are amphipathic molecules of microbial origin that reduce surface and interfacial tension at gas–liquid–solid interfaces. Earlier, the biosurfactant was isolated and characterized in our laboratory from Candida parapsilosis. The property of the biosurfactant is further explored in this study by using quantum dots (QDs) as nanocarrier.Materials and methods: Graphene quantum dots (GQDs) were synthesized by bottom-up approach through pyrolysis of citric acid. GQDs were conjugated with both biosurfactant and folic acid (FA) using carbodiimide chemistry. The prepared GQD bioconjugate was studied for diagnostic and therapeutic effects against cancer cells.Results and discussion: Photoluminescence quantum yield (QY) of plain GQDs was measured as 12.8%. QY for biosurfactant conjugated GQDs and FA-biosurfactant conjugated GQDs was measured as 10.4% and 9.02%, respectively, and it was sufficient for targeting cancer cells. MTT assay showed that more than 90% of cells remained viable at concentration of 1 mg/mL, hence GQDs seemed to be non-toxic to cells. Biosurfactant conjugated GQDs caused 50% reduction in cellular viability within 24 hours. FA conjugation further increased the specificity of bioconjugated GQDs toward tumor cells, which is clearly evident from the drug internalization studies using confocal laser scanning microscopy. A higher amount of drug uptake was observed when bioconjugated GQDs were decorated with FA.Conclusion: The ability of GQD bioconjugate could be used as a theranostic tool for cancer. 
It is foreseen that in near future cancer can be detected and/or treated at an early stage by utilizing biosurfactant conjugated GQDs. Therefore, the proposed study would provide a stepping stone to improve the life of cancer patients. Keywords: bioconjugation, nanomedicine, nanocarrier, cancer therapy, folic acid receptor, graphene quantum dots","title":"Development of biosurfactant-based graphene quantum dot conjugate as a novel and fluorescent theranostic tool for cancer","journal":{"volume":"Volume 14","country":"GB","license":[{"open_access":true,"title":"CC BY-NC","type":"CC BY-NC","url":"https://www.dovepress.com/author_guidelines.php?content_id=695"}],"issns":["1176-9114","1178-2013"],"publisher":"Dove Medical Press","language":["EN"],"title":"International Journal of Nanomedicine"},"month":"1","start_page":"809"},"created_date":"2019-01-29T18:43:40Z","id":"e0173c80437f4fb88ec4e02e453e13b0"} -{"last_updated":"2020-02-04T09:46:14Z","bibjson":{"identifier":[{"id":"1424-8220","type":"eissn"},{"id":"10.3390/s18124467","type":"doi"}],"journal":{"volume":"18","number":"12","country":"CH","license":[{"open_access":true,"title":"CC BY","type":"CC BY","url":"http://www.mdpi.com/journal/sensors/about"}],"issns":["1424-8220"],"publisher":"MDPI AG","language":["EN"],"title":"Sensors"},"month":"12","keywords":["multilayer sea ice temperature","low temperature","design","performance analysis"],"year":"2018","start_page":"4467","subject":[{"code":"TP1-1185","scheme":"LCC","term":"Chemical technology"}],"author":[{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Guangyu Zuo"},{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Yinke Dou"},{"affiliation":"College of Water Resources Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Xiaomin Chang"},{"affiliation":"College of Electrical and 
Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Yan Chen"},{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Chunyan Ma"}],"link":[{"content_type":"pdf","type":"fulltext","url":"https://www.mdpi.com/1424-8220/18/12/4467"}],"abstract":"Temperature profiles of sea ice have been recorded more than a few decades. However, few high-precision temperature sensors can complete the observation of temperature profile of sea ice, especially in extreme environments. At present, the most widely used sea ice observation instruments can reach an accuracy of sea ice temperature measurement of 0.1 °C. In this study, a multilayer sea ice temperature sensor is developed with temperature measurement accuracy from −0.0047 °C to 0.0059 °C. The sensor system composition, structure of the thermistor string, and work mode are analyzed. The performance of the sensor system is evaluated from −50 °C to 30 °C. The temperature dependence of the constant current source, the amplification circuit, and the analog-to-digital converter (ADC) circuit are comprehensive tested and quantified. A temperature correction algorithm is designed to correct any deviation in the sensor system. A sea-ice thickness discrimination algorithm is proposed in charge of determining the thickness of sea ice automatically. The sensor system was field tested in Wuliangsuhai, Yellow River on 31 January 2018 and the second reservoir of Fen River, Yellow River on 30 January 2018. The integral practicality of this sensor system is identified and examined. 
The multilayer sea ice temperature sensor will provide good temperature results of sea ice and maintain stable performance in the low ambient temperature.","title":"Design and Performance Analysis of a Multilayer Sea Ice Temperature Sensor Used in Polar Region"},"admin":{"seal":true},"created_date":"2018-12-18T08:13:29Z","id":"152f83d12b9f477696e681684ba696e7"} -{"last_updated":"2020-06-02T23:02:32Z","bibjson":{"identifier":[{"id":"10.123/abc","type":"doi"},{"id":"2076-3417","type":"eissn"}],"journal":{"volume":"10","number":"3872","country":"CH","license":[{"open_access":true,"title":"CC BY","type":"CC BY","url":"http://www.mdpi.com/about/openaccess"}],"issns":["2076-3417"],"publisher":"MDPI AG","language":["EN"],"title":"Applied Sciences"},"month":"06","keywords":["Smart parking systems","survey","vehicle routing problem","vehicle detection techniques","routing algorithms"],"year":"2020","start_page":"3872","subject":[{"code":"T","scheme":"LCC","term":"Technology"},{"code":"TA1-2040","scheme":"LCC","term":"Engineering (General). Civil engineering (General)"},{"code":"QH301-705.5","scheme":"LCC","term":"Biology (General)"},{"code":"QC1-999","scheme":"LCC","term":"Physics"},{"code":"QD1-999","scheme":"LCC","term":"Chemistry"}],"author":[{"affiliation":"Institute of Computer Science. Faculty of Exact, Physical and Natural Sciences. National University of San Juan, 5400 San Juan, Argentina","name":"Mathias Gabriel Diaz Ogás"},{"affiliation":"Institute of Informatics and Applications. University of Girona, 17003 Girona, Spain","name":"Ramon Fabregat"},{"affiliation":"Institute of Computer Science. Faculty of Exact, Physical and Natural Sciences. 
National University of San Juan, 5400 San Juan, Argentina","name":"Silvana Aciar"}],"link":[{"content_type":"text/html","type":"fulltext","url":"https://www.mdpi.com/2076-3417/10/11/3872"}],"abstract":"The large number of vehicles constantly seeking access to congested areas in cities means that finding a public parking place is often difficult and causes problems for drivers and citizens alike. In this context, strategies that guide vehicles from one point to another, looking for the most optimal path, are needed. Most contributions in the literature are routing strategies that take into account different criteria to select the optimal route required to find a parking space. This paper aims to identify the types of smart parking systems (SPS) that are available today, as well as investigate the kinds of vehicle detection techniques (VDT) they have and the algorithms or other methods they employ, in order to analyze where the development of these systems is at today. To do this, a survey of 274 publications from January 2012 to December 2019 was conducted. The survey considered four principal features: SPS types reported in the literature, the kinds of VDT used in these SPS, the algorithms or methods they implement, and the stage of development at which they are. Based on a search and extraction of results methodology, this work was able to effectively obtain the current state of the research area. In addition, the exhaustive study of the studies analyzed allowed for a discussion to be established concerning the main difficulties, as well as the gaps and open problems detected for the SPS. 
The results shown in this study may provide a base for future research on the subject.","title":"Survey of Smart Parking Systems"},"admin":{"seal":true},"id":"9cf511bab39445ba9745feb43d7493dd","created_date":"2020-06-03T00:02:28Z"} +{"last_updated":"2020-02-04T14:11:44Z","bibjson":{"identifier":[{"id":"0264-1275","type":"pissn"},{"id":"10.1016/j.matdes.2016.06.110","type":"DOI"}],"journal":{"volume":"108","number":"","country":"GB","license":[{"open_access":true,"title":"CC BY-NC-ND","type":"CC BY-NC-ND","url":"https://www.elsevier.com/journals/materials-and-design/0264-1275/open-access-journal"}],"issns":["1234-5678","0264-1275","1873-4197"],"publisher":"Elsevier","language":["EN"],"title":"Materials & Design"},"month":"10","end_page":"617","year":"2016","start_page":"608","subject":[{"code":"TA401-492","scheme":"LCC","term":"Materials of engineering and construction. Mechanics of materials"}],"author":[{"affiliation":"State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China","name":"Xinfeng Li"},{"affiliation":"Department of Geosciences, Center for Materials by Design, State University of New York, Stony Brook, NY 11794-2100, USA","name":"Jin Zhang"},{"affiliation":"School of Chemical Engineering & Technology, China University of Mining and Technology, Xuzhou 221116, China","name":"Yanfei Wang"},{"affiliation":"State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China","name":"Sicong Shen"},{"affiliation":"State Key Laboratory for Mechanical Behavior of Materials, School of Materials Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China; Corresponding author.","name":"Xiaolong Song"}],"link":[{"type":"fulltext","url":"http://www.sciencedirect.com/science/article/pii/S0264127516308723"}],"abstract":"The tensile properties and fracture behavior of PH 13-8 Mo steel 
after subjected to pre-charged hydrogen were investigated by slow strain rate tensile tests. The results suggest that hydrogen slightly increases yield strength, while decreases tensile strength. The susceptibility to hydrogen embrittlement of specimens aged at 650 °C firstly reduces and then increases as the aging time increases, reaching the lowest value at aging time 4 h. This is dominantly attributed to the highest content of austenite. Moreover, hydrogen-induced crack nucleation sites initiate from lath, packet and prior austenite grain boundaries. Crack propagation passes through lath boundaries and walks along packet, prior austenite grain boundaries. Scanning electron microscopy result indicates that hydrogen-charged specimens show quasi-cleavage fracture and intergranular fracture in annular brittle zone while dimple fracture is observed in hydrogen-free specimens. Keywords: Hydrogen embrittlement, PH 13-8 Mo steel, Aging time, Fracture behavior","title":"Effect of hydrogen on tensile properties and fracture behavior of PH 13-8 Mo steel"},"created_date":"2019-06-05T05:25:15Z","id":"e58f08a11ecb495ead55a44ad4f89808"} +{"last_updated":"2020-02-04T08:06:42Z","bibjson":{"identifier":[{"id":"2072-6694","type":"eissn"},{"id":"10.3390/cancers9080107","type":"doi"}],"journal":{"volume":"9","number":"8","country":"CH","license":[{"open_access":true,"title":"CC BY","type":"CC BY","url":"http://www.mdpi.com/journal/cancers/about"}],"issns":["1234-5678","2072-6694"],"publisher":"MDPI AG","language":["EN"],"title":"Cancers"},"month":"8","keywords":["ALK rearrangement, lung cancer, biology, immunohistochemistry, FISH, molecular biology."],"year":"2017","start_page":"107","subject":[{"code":"RC254-282","scheme":"LCC","term":"Neoplasms. Tumors. Oncology. 
Including cancer and carcinogens"}],"author":[{"affiliation":"Laboratory of Clinical and Experimental Pathology, Pasteur Hospital, 30 avenue de la voie romaine, 06001 Nice cedex 01, France","name":"Paul Hofman"}],"link":[{"content_type":"pdf","type":"fulltext","url":"https://www.mdpi.com/2072-6694/9/8/107"}],"abstract":"Patients with advanced-stage non-small cell lung carcinoma (NSCLC) harboring an ALK rearrangement, detected from a tissue sample, can benefit from targeted ALK inhibitor treatment. Several increasingly effective ALK inhibitors are now available for treatment of patients. However, despite an initial favorable response to treatment, in most cases relapse or progression occurs due to resistance mechanisms mainly caused by mutations in the tyrosine kinase domain of ALK. The detection of an ALK rearrangement is pivotal and can be done using different methods, which have variable sensitivity and specificity depending, in particular, on the quality and quantity of the patient’s sample. This review will first highlight briefly some information regarding the pathobiology of an ALK rearrangement and the epidemiology of patients harboring this genomic alteration. 
The different methods used to detect an ALK rearrangement as well as their advantages and disadvantages will then be examined and algorithms proposed for detection in daily routine practice.","title":"ALK in Non-Small Cell Lung Cancer (NSCLC) Pathobiology, Epidemiology, Detection from Tumor Tissue and Algorithm Diagnosis in a Daily Practice"},"admin":{"seal":true},"created_date":"2018-10-26T07:49:34Z","id":"937c7aa790e048d4ae5f53a2ad71f0dc"} +{"last_updated":"2020-02-04T13:43:13Z","bibjson":{"identifier":[{"id":"1178-2013","type":"pissn"}],"end_page":"818","keywords":["bioconjugation","biosurfactant","cancer therapy","folic acid receptor","graphene quantum dots","theranostic tool"],"year":"2019","subject":[{"code":"R5-920","scheme":"LCC","term":"Medicine (General)"}],"author":[{"name":"Bansal S"},{"name":"Singh J"},{"name":"Kumari U"},{"name":"Kaur IP"},{"name":"Barnwal RP"},{"name":"Kumar R"},{"name":"Singh S"},{"name":"Singh G"},{"name":"Chatterjee M"}],"link":[{"content_type":"html","type":"fulltext","url":"https://www.dovepress.com/development-of-biosurfactant-based-graphene-quantum-dot-conjugate-as-a-peer-reviewed-article-IJN"}],"abstract":"Smriti Bansal,1 Joga Singh,2 Uma Kumari,3 Indu Pal Kaur,2 Ravi Pratap Barnwal,4 Ravinder Kumar,3 Suman Singh,5 Gurpal Singh,2 Mary Chatterjee1 1Biotechnology Engineering, University Institute of Engineering & Technology, Panjab University, Chandigarh, India; 2Department of Pharmaceutical Sciences, University Institute of Pharmaceutical Sciences, Panjab University, Chandigarh, India; 3Department of Zoology, Panjab University, Chandigarh, India; 4Department of Biophysics, Panjab University, Chandigarh, India; 5Department of Agronomics, Central Scientific Instruments Organisation, Chandigarh, India Background: Biosurfactants are amphipathic molecules of microbial origin that reduce surface and interfacial tension at gas–liquid–solid interfaces. 
Earlier, the biosurfactant was isolated and characterized in our laboratory from Candida parapsilosis. The property of the biosurfactant is further explored in this study by using quantum dots (QDs) as nanocarrier.Materials and methods: Graphene quantum dots (GQDs) were synthesized by bottom-up approach through pyrolysis of citric acid. GQDs were conjugated with both biosurfactant and folic acid (FA) using carbodiimide chemistry. The prepared GQD bioconjugate was studied for diagnostic and therapeutic effects against cancer cells.Results and discussion: Photoluminescence quantum yield (QY) of plain GQDs was measured as 12.8%. QY for biosurfactant conjugated GQDs and FA-biosurfactant conjugated GQDs was measured as 10.4% and 9.02%, respectively, and it was sufficient for targeting cancer cells. MTT assay showed that more than 90% of cells remained viable at concentration of 1 mg/mL, hence GQDs seemed to be non-toxic to cells. Biosurfactant conjugated GQDs caused 50% reduction in cellular viability within 24 hours. FA conjugation further increased the specificity of bioconjugated GQDs toward tumor cells, which is clearly evident from the drug internalization studies using confocal laser scanning microscopy. A higher amount of drug uptake was observed when bioconjugated GQDs were decorated with FA.Conclusion: The ability of GQD bioconjugate could be used as a theranostic tool for cancer. It is foreseen that in near future cancer can be detected and/or treated at an early stage by utilizing biosurfactant conjugated GQDs. Therefore, the proposed study would provide a stepping stone to improve the life of cancer patients. 
Keywords: bioconjugation, nanomedicine, nanocarrier, cancer therapy, folic acid receptor, graphene quantum dots","title":"Development of biosurfactant-based graphene quantum dot conjugate as a novel and fluorescent theranostic tool for cancer","journal":{"volume":"Volume 14","country":"GB","license":[{"open_access":true,"title":"CC BY-NC","type":"CC BY-NC","url":"https://www.dovepress.com/author_guidelines.php?content_id=695"}],"issns":["1234-5678","1176-9114","1178-2013"],"publisher":"Dove Medical Press","language":["EN"],"title":"International Journal of Nanomedicine"},"month":"1","start_page":"809"},"created_date":"2019-01-29T18:43:40Z","id":"e0173c80437f4fb88ec4e02e453e13b0"} +{"last_updated":"2020-02-04T09:46:14Z","bibjson":{"identifier":[{"id":"1424-8220","type":"eissn"},{"id":"10.3390/s18124467","type":"doi"}],"journal":{"volume":"18","number":"12","country":"CH","license":[{"open_access":true,"title":"CC BY","type":"CC BY","url":"http://www.mdpi.com/journal/sensors/about"}],"issns":["1234-5678","1424-8220"],"publisher":"MDPI AG","language":["EN"],"title":"Sensors"},"month":"12","keywords":["multilayer sea ice temperature","low temperature","design","performance analysis"],"year":"2018","start_page":"4467","subject":[{"code":"TP1-1185","scheme":"LCC","term":"Chemical technology"}],"author":[{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Guangyu Zuo"},{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Yinke Dou"},{"affiliation":"College of Water Resources Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Xiaomin Chang"},{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China","name":"Yan Chen"},{"affiliation":"College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, 
China","name":"Chunyan Ma"}],"link":[{"content_type":"pdf","type":"fulltext","url":"https://www.mdpi.com/1424-8220/18/12/4467"}],"abstract":"Temperature profiles of sea ice have been recorded more than a few decades. However, few high-precision temperature sensors can complete the observation of temperature profile of sea ice, especially in extreme environments. At present, the most widely used sea ice observation instruments can reach an accuracy of sea ice temperature measurement of 0.1 °C. In this study, a multilayer sea ice temperature sensor is developed with temperature measurement accuracy from −0.0047 °C to 0.0059 °C. The sensor system composition, structure of the thermistor string, and work mode are analyzed. The performance of the sensor system is evaluated from −50 °C to 30 °C. The temperature dependence of the constant current source, the amplification circuit, and the analog-to-digital converter (ADC) circuit are comprehensive tested and quantified. A temperature correction algorithm is designed to correct any deviation in the sensor system. A sea-ice thickness discrimination algorithm is proposed in charge of determining the thickness of sea ice automatically. The sensor system was field tested in Wuliangsuhai, Yellow River on 31 January 2018 and the second reservoir of Fen River, Yellow River on 30 January 2018. The integral practicality of this sensor system is identified and examined. 
The multilayer sea ice temperature sensor will provide good temperature results of sea ice and maintain stable performance in the low ambient temperature.","title":"Design and Performance Analysis of a Multilayer Sea Ice Temperature Sensor Used in Polar Region"},"admin":{"seal":true},"created_date":"2018-12-18T08:13:29Z","id":"152f83d12b9f477696e681684ba696e7"} +{"last_updated":"2020-06-02T23:02:32Z","bibjson":{"identifier":[{"id":"10.123/abc","type":"doi"},{"id":"2076-3417","type":"eissn"}],"journal":{"volume":"10","number":"3872","country":"CH","license":[{"open_access":true,"title":"CC BY","type":"CC BY","url":"http://www.mdpi.com/about/openaccess"}],"issns":["1234-5678","2076-3417"],"publisher":"MDPI AG","language":["EN"],"title":"Applied Sciences"},"month":"06","keywords":["Smart parking systems","survey","vehicle routing problem","vehicle detection techniques","routing algorithms"],"year":"2020","start_page":"3872","subject":[{"code":"T","scheme":"LCC","term":"Technology"},{"code":"TA1-2040","scheme":"LCC","term":"Engineering (General). Civil engineering (General)"},{"code":"QH301-705.5","scheme":"LCC","term":"Biology (General)"},{"code":"QC1-999","scheme":"LCC","term":"Physics"},{"code":"QD1-999","scheme":"LCC","term":"Chemistry"}],"author":[{"affiliation":"Institute of Computer Science. Faculty of Exact, Physical and Natural Sciences. National University of San Juan, 5400 San Juan, Argentina","name":"Mathias Gabriel Diaz Ogás"},{"affiliation":"Institute of Informatics and Applications. University of Girona, 17003 Girona, Spain","name":"Ramon Fabregat"},{"affiliation":"Institute of Computer Science. Faculty of Exact, Physical and Natural Sciences. 
National University of San Juan, 5400 San Juan, Argentina","name":"Silvana Aciar"}],"link":[{"content_type":"text/html","type":"fulltext","url":"https://www.mdpi.com/2076-3417/10/11/3872"}],"abstract":"The large number of vehicles constantly seeking access to congested areas in cities means that finding a public parking place is often difficult and causes problems for drivers and citizens alike. In this context, strategies that guide vehicles from one point to another, looking for the most optimal path, are needed. Most contributions in the literature are routing strategies that take into account different criteria to select the optimal route required to find a parking space. This paper aims to identify the types of smart parking systems (SPS) that are available today, as well as investigate the kinds of vehicle detection techniques (VDT) they have and the algorithms or other methods they employ, in order to analyze where the development of these systems is at today. To do this, a survey of 274 publications from January 2012 to December 2019 was conducted. The survey considered four principal features: SPS types reported in the literature, the kinds of VDT used in these SPS, the algorithms or methods they implement, and the stage of development at which they are. Based on a search and extraction of results methodology, this work was able to effectively obtain the current state of the research area. In addition, the exhaustive study of the studies analyzed allowed for a discussion to be established concerning the main difficulties, as well as the gaps and open problems detected for the SPS. 
The results shown in this study may provide a base for future research on the subject.","title":"Survey of Smart Parking Systems"},"admin":{"seal":true},"id":"9cf511bab39445ba9745feb43d7493dd","created_date":"2020-06-03T00:02:28Z"}
diff --git a/python/tests/import_doaj.py b/python/tests/import_doaj.py
index 9c4ba552..81f17d7c 100644
--- a/python/tests/import_doaj.py
+++ b/python/tests/import_doaj.py
@@ -144,8 +144,7 @@ def test_doaj_dict_parse(doaj_importer):
     assert r.pages == "608-617"
     assert r.version is None
     assert r.language == "en"
-    # matched by ISSN, so wouldn't be defined normally
-    assert r.extra["container_name"] == "Materials & Design"
+    assert r.container_id
     assert len(r.abstracts) == 1
     assert len(r.abstracts[0].content) == 1033
     assert len(r.contribs) == 5