path: root/scalding/src/main/scala/sandcrawler/Scorable.scala
Commit message | Author | Age | Files | Lines
* increase MaxTitleLength from 255 to 1023 | Bryan Newbold | 2018-08-23 | 1 | -1/+1
    Motivated after finding some long titles with MathML mixed in. Until this issue can be investigated further, bumping this limit to pass the handful of matches found.
* Added title length filtering to GrobidScorable | Ellen Spertus | 2018-08-22 | 1 | -0/+1
* Factored out ScorableFeatures. | Ellen Spertus | 2018-08-13 | 1 | -30/+0
* Pipeline works, all tests pass, no scalastyle errors. | Ellen Spertus | 2018-08-13 | 1 | -1/+1
* Snapshot before changing Scorable to find bug. | Ellen Spertus | 2018-08-12 | 1 | -1/+0
* Tests pass. | Ellen Spertus | 2018-08-12 | 1 | -5/+6
* It compiles. | Ellen Spertus | 2018-08-11 | 1 | -12/+28
* It compiles | Ellen Spertus | 2018-08-10 | 1 | -3/+3
* Broken code to share with Bryan. | Ellen Spertus | 2018-08-09 | 1 | -1/+1
* WIP | Ellen Spertus | 2018-08-09 | 1 | -2/+3
* WIP | Ellen Spertus | 2018-08-09 | 1 | -4/+5
* Removed implicit parameters. Does not compile. | Ellen Spertus | 2018-08-09 | 1 | -3/+3
* WIP | Ellen Spertus | 2018-08-09 | 1 | -1/+1
* Fixed scalastyle violations. | Ellen Spertus | 2018-08-09 | 1 | -4/+3
* Added test for null argument to titleToSlug() | Ellen Spertus | 2018-08-09 | 1 | -4/+9
* Added punctuation removal to slug creation and similarity comparisons | Ellen Spertus | 2018-08-07 | 1 | -1/+2
* Minor refactoring. Added test. | Ellen Spertus | 2018-08-07 | 1 | -9/+6
* Removed commented-out code. | Ellen Spertus | 2018-08-07 | 1 | -29/+0
* Added CrossrefScorable.scala. All code compiles. | Ellen Spertus | 2018-08-07 | 1 | -2/+2
* New code compiles. Old tests pass. New tests not yet written. | Ellen Spertus | 2018-08-06 | 1 | -3/+6
* Partly refactored HBaseCrossrefScoreJob. Everything compiles. | Ellen Spertus | 2018-08-06 | 1 | -0/+115