path: root/scalding/src/main/scala/sandcrawler/GrobidScorable.scala
Commit message | Author | Age | Files | Lines
* replaced NoSlug with proper use of Option | Ellen Spertus | 2018-09-04 | 1 | -4/+4
* clean up commented out code in scalding/ | Bryan Newbold | 2018-08-24 | 1 | -3/+2
* author parsing (and year, for crossref) | Bryan Newbold | 2018-08-23 | 1 | -1/+13
* Added title length filtering to GrobidScorable | Ellen Spertus | 2018-08-22 | 1 | -0/+16
* use grobid0:metadata, not tei_json | Bryan Newbold | 2018-08-21 | 1 | -5/+5
    This is for efficiency. I had forgotten that the extract script actually writes this path!
* Created static factory method for ScorableCreations to deal with null. | Ellen Spertus | 2018-08-20 | 1 | -1/+1
* handle null status_code lines | Bryan Newbold | 2018-08-15 | 1 | -0/+1
* grobid scoring: status_code as signed int, not string | Bryan Newbold | 2018-08-15 | 1 | -2/+7
* Now ignores grobid entries with status other than 200. | Ellen Spertus | 2018-08-14 | 1 | -3/+7
* Factored out ScorableFeatures. | Ellen Spertus | 2018-08-13 | 1 | -5/+1
* Pipeline works, all tests pass, no scalastyle errors. | Ellen Spertus | 2018-08-13 | 1 | -2/+1
* It compiles. | Ellen Spertus | 2018-08-11 | 1 | -11/+10
* It compiles | Ellen Spertus | 2018-08-10 | 1 | -3/+4
* Broken code to share with Bryan. | Ellen Spertus | 2018-08-09 | 1 | -1/+1
* WIP | Ellen Spertus | 2018-08-09 | 1 | -2/+3
* WIP | Ellen Spertus | 2018-08-09 | 1 | -4/+5
* Removed implicit parameters. Does not compile. | Ellen Spertus | 2018-08-09 | 1 | -1/+1
* WIP | Ellen Spertus | 2018-08-09 | 1 | -8/+7
* Fixed scalastyle violations. | Ellen Spertus | 2018-08-09 | 1 | -12/+9
* Removed HBaseCrossrefScore{Job,Test} and references thereto. | Ellen Spertus | 2018-08-07 | 1 | -3/+5
* Added GrobidScorableTest, minor improvements. | Ellen Spertus | 2018-08-07 | 1 | -9/+15
* Added CrossrefScorable.scala. All code compiles. | Ellen Spertus | 2018-08-07 | 1 | -8/+5
* New code compiles. Old tests pass. New tests not yet written. | Ellen Spertus | 2018-08-06 | 1 | -0/+48