# Data issues specific to the citation graph

## Repeated entries

* 2020-04-19
* https://qa.fatcat.wiki/release/lcarb5rg5vf3tk4hpvosja5sm4/outbound-refs

A DOI seems to be used as the reference key, which leads to repeated entries.

> 2021-07-02: Solved, kind of. We get rid of various duplicates in a
> post-processing step. It would still be better to not generate these in the
> first place.
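
A minimal sketch of what such a duplicate-removing post-processing step could
look like, assuming each edge is a dict with hypothetical `source` and `target`
fields (the real pipeline and its field names may differ):

```python
from typing import Iterable, Iterator


def dedupe_edges(edges: Iterable[dict]) -> Iterator[dict]:
    """Yield each (source, target) pair once, keeping the first occurrence."""
    seen = set()
    for edge in edges:
        # "source" and "target" are assumed field names, for illustration only.
        pair = (edge["source"], edge["target"])
        if pair in seen:
            continue
        seen.add(pair)
        yield edge
```

This keeps all seen pairs in memory; for a graph that does not fit, sorting by
the pair and dropping adjacent repeats would likely serve the same purpose.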

## Self references

* 2020-04-19
* https://qa.fatcat.wiki/release/3fcp4pk7nfamvkbjekqam24bfq/outbound-refs

The source and target seem to be the same.

> 2021-07-02: Solved in post-processing, for now.
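
One way a post-processing filter for self references might look, again assuming
hypothetical `source` and `target` fields on each edge (a sketch, not the actual
step used):

```python
from typing import Iterable, Iterator


def drop_self_references(edges: Iterable[dict]) -> Iterator[dict]:
    """Skip edges whose source and target identifiers are equal."""
    for edge in edges:
        # "source" and "target" are assumed field names, for illustration only.
        if edge["source"] == edge["target"]:
            continue
        yield edge
```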

## Duplicated edges

* 2020-04-20
* https://qa.fatcat.wiki/release/22222736evcc7kdn3bleua3fge/outbound-refs
* found 16/1M

The same source and target pair appears more than once, maybe a DOI combined with a ref key?

> 2021-07-02: Solved in post-processing, for now.
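
To keep an eye on how often this occurs, a sample could be checked with
something like the sketch below. It assumes hypothetical `source` and `target`
fields and counts every occurrence of a pair beyond the first as one duplicate;
how the figure above was originally counted is not recorded here.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_duplicated_edges(edges: Iterable[dict]) -> Tuple[int, int]:
    """Return (number of duplicated edges, total number of edges)."""
    # "source" and "target" are assumed field names, for illustration only.
    counts = Counter((e["source"], e["target"]) for e in edges)
    # Each occurrence of a pair beyond its first counts as one duplicate.
    duplicates = sum(n - 1 for n in counts.values() if n > 1)
    total = sum(counts.values())
    return duplicates, total
```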