Diffstat (limited to 'docs')
-rw-r--r-- | docs/concepts.md | 84
-rw-r--r-- | docs/contents.json | 12
-rw-r--r-- | docs/cookbook/tutorial.md | 50
-rw-r--r-- | docs/how-dat-works.md | 4
-rw-r--r-- | docs/install.md | 55
-rw-r--r-- | docs/intro.md | 48
-rw-r--r-- | docs/overview.md | 92
-rw-r--r-- | docs/terms.md | 8
-rw-r--r-- | docs/troubleshooting.md | 9
-rw-r--r-- | docs/tutorial.md | 74
10 files changed, 250 insertions, 186 deletions
diff --git a/docs/concepts.md b/docs/concepts.md new file mode 100644 index 0000000..acae740 --- /dev/null +++ b/docs/concepts.md @@ -0,0 +1,84 @@ +# Key Concepts + +## In Place Archiving + +You can turn any folder on your computer into *a dat*. We call this *in place archiving*. A dat is a regular folder with some magic attached. The magic is a set of metadata files, in a `.dat` folder. Dat uses the metadata to track file history and securely share your files. Your files and the `.dat` folder can be instantly synced to anywhere. + +<img src="/assets/dat_folder.png" alt="Create a dat with any folder" style="width:500px;"/> + +Once installing Dat, you can use a single command to live sync your files to friends, backup to an external drive, and publish to a website (so people can download over http too!). The cool part is this all happens at the same time. If you go offline for a bit, no worries. Dat shares the latest files and any saved history once you are back online. These data transfers happen between the computers, forgoing any centralized source. + +In place archiving in Dat really means **any place**. Dat seamlessly syncs your files where you want and when you want. Dat's decentralized technology and automatic versioning will improve data availability and data quality without sacrificing ease of use. + +## Distributed Network + +Dat goes beyond regular archiving through it's *distributed network*. When you share data, Dat sends data to many download locations at once, and they can sync the same data with each other! By connecting users directly Dat transfers files faster, especially sharing on a local network. Distributed syncing allows robust global archiving for public data. + +<img src="/assets/share_link.png" alt="Share unique dat link" style="width:500px;"/> + +To maintain privacy, the dat link controls access to your data. Any data shared in the network is encrypted using your link as the password. 
Learn more about Dat's securtiy and privacy below or in [the faqs](faq#security-and-privacy). We are also investigating ways to improve [reader privacy](https://blog.datproject.org/2016/12/12/reader-privacy-on-the-p2p-web/) for public data. + +## Version History + +Dat automatically maintains a built in version history whenever files are added. Dat uses this history to allow partial downloads of files, for example only getting the latest files. There are two types of versioning performed automatically by Dat. Metadata is stored in a folder called `.dat` in the main folder of a repository, and data is stored as normal files in the main folder. + +Dat uses append-only registers to store version history. This means all changes are written to the end of the file, growing over time. + +### Metadata Versioning + +Dat acts as a one-to-one mirror of the state of a folder and all it's contents. When importing files, Dat grabs the filesystem metadata for each file and checks if there is already an entry for this filename. If the file with this metadata matches exactly the newest version of the file metadata stored in Dat, then this file will be skipped (no change). + +If the metadata differs or does not exist, then this new metadata entry will be appended as the new 'latest' version for this file in the append-only SLEEP metadata content register. + +### Content Versioning + +The metadata only tells you if or when a file is changed, now how it changed. In addition to the metadata, Dat tracks changes in the content in a similar manner. + +The default storage system used in Dat stores the files as files. This has the advantage of being very straightforward for users to understand, but the downside of not storing old versions of content by default. + +In contrast to other version control systems, like Git, Dat only stores the current set of files, not older versions. Git, for example, stores all previous content versions and all previous metadata versions in the `.git` folder. 
But Dat is designed for larger datasets. + +Storing all history on content could easily fill up the users hard drive. Dat has multiple storage modes based on usage. With Dat's dynamic storage, you can store the content history on a local external hard drive or on a remote server (or both!). + +## Dat Privacy + +Files shared with Dat are encrypted (using the link) so *only* users with your unique link can access your files. The link acts as a kind of password meaning, generally, you should assume *anyone* with the link will have access to your files. + +The link allows users to download, and re-share, your files, whether you intended them to have the link or not (with some hand waiving assumptions about them being able to connect to you, which can be limited, see more in [security & privacy faq](faq#security-and-privacy)). + +Make sure you are thoughtful about who you share links with and how. Dat ensures links cannot be intercepted through the Dat network. If you share your links over other channels, ensure the privacy & security matches or exceeds your data security needs. We try to limit times when Dat displays full links to avoid accidental sharing. + +## dat:// links + +Dat links have some special properties that are helpful to understand. + +Traditionally, http links point to a specific server, e.g. datproject.org's server, and/or a specific resource on that server. Unfortunately, links often break or the content changes without notification (this makes it impossible to cite `nytimes.com`, for example, because the link is meaningless without a reference to what content was there at citation time). Dat links, on the other hand, never change. You can update data in a dat and use the same link to download the changes. + +Here is an example dat link: + +``` +dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93 +``` + +What is with the weird long string of characters? Let's break it down! 
+ +**`dat://` - the protocol** + +The first part of the link is the link protocol, Dat (read about the Dat protocol at [datprotocol.com](http://www.datprotocol.com)). The protocol describes what "language" the link is in and what type of applications can open it. You do not always need this part with Dat but it is helpful context. + +**`ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93` - the unique identifier** + +The second part of the link is a 64-character hex strings ([ed25519 public-keys](https://ed25519.cr.yp.to/) to be precise). Each Dat archive gets a public key link to identify it. With the hex string as a link we can a few things: + +1. Encrypt the data transfer +2. Create a persistent identifier, an ID that never changes, even as file are updated (as opposed to a checksum which is based on the file contents). + +**`dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93`** + +All together, the links can be thought of similarly to a web URL, as a place to get content, but with some extra special properties. When you download a dat link: + +1. You do not have to worry about where the files are stored. +2. You can always get the latest files available. +3. You can view the version history or add version numbers to links to get an permanent link to a specific version. 
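Editor's sketch (not part of the commit): the link anatomy described above can be checked with a few lines of Python. This is a minimal illustration that assumes only what the text states, namely an optional `dat://` prefix followed by a 64-character hex string. The helper name `parse_dat_link` is hypothetical and is not part of any Dat library. Version suffixes are not handled, because the text above does not give their syntax.

```python
import re

# Assumption taken from the text above: an optional "dat://" prefix,
# then exactly 64 hexadecimal characters (the archive's public key).
_DAT_LINK = re.compile(r"(?:dat://)?([0-9a-fA-F]{64})")


def parse_dat_link(link: str) -> str:
    """Return the 64-character key in lowercase, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    match = _DAT_LINK.fullmatch(link.strip())
    if match is None:
        raise ValueError(f"not a dat link: {link!r}")
    return match.group(1).lower()


if __name__ == "__main__":
    example = "dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93"
    print(parse_dat_link(example))
```

Because the protocol prefix is optional, the bare hex string and the full `dat://` form resolve to the same key under this sketch.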
+ +[If you'd like you read more about how dat works, see our whitepaper.](https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf) diff --git a/docs/contents.json b/docs/contents.json index f87b1f1..fb1eb16 100644 --- a/docs/contents.json +++ b/docs/contents.json @@ -1,14 +1,15 @@ { "Using Dat": { "Introduction": "intro.md", - "Dat Concepts": "overview.md", - "Command Line": "modules/dat.md", - "FAQ": "faq.md", + "Installation": "install.md", + "Getting Started": "tutorial.md", + "Key Concepts": "concepts.md", + "Terminology": "terms.md", + "Frequently Asked Questions": "faq.md", "Troubleshooting": "troubleshooting.md", - "Terminology": "terms.md" + "About The Project": "overview.md" }, "Cookbook": { - "Getting Started": "cookbook/tutorial.md", "Sharing files over HTTP": "cookbook/http.md", "Running a Dat Server": "cookbook/server.md", "In the Browser": "cookbook/browser.md", @@ -17,6 +18,7 @@ }, "Dat Technology": { "Overview": "ecosystem.md", + "Command Line": "modules/dat.md", "dat-node": "modules/dat-node.md", "hyperdrive": "modules/hyperdrive.md", "hypercore": "modules/hypercore.md", diff --git a/docs/cookbook/tutorial.md b/docs/cookbook/tutorial.md deleted file mode 100644 index 58f5288..0000000 --- a/docs/cookbook/tutorial.md +++ /dev/null @@ -1,50 +0,0 @@ -# Getting Started with Dat - -In this tutorial we will go through the two main ways to use Dat, sharing data and downloading data. If possible, this is great to go through with a partner to see how Dat works across computers. Get Dat [installed](intro#installation) and get started! - -Dat Desktop makes it easy for anyone to get started using Dat with user-friendly interface. If you are comfortable with the command line then you can install dat via npm. You can always switch apps later and keep your dats the same. Dat can share your files to anyone, it does not matter how they are using Dat. 
- -## Command Line Tutorial - -### Downloading Data - -We made a demo folder we made just for this exercise. Inside the demo folder is a `dat.json` file and a gif. We shared these files via Dat and now you can download them with our dat key! - -Similar to git, you do download somebody's dat by running `dat clone <link>`. You can also specify the directory: - -``` -❯ dat clone dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639 ~/Downloads/dat-demo -dat v13.5.0 -Created new dat in /Users/joe/Downloads/dat-demo/.dat -Cloning: 2 files (1.4 MB) - -2 connections | Download 614 KB/s Upload 0 B/s - -dat sync complete. -Version 4 -``` - -This will download our demo files to the `~/downloads/dat-demo` folder. These files are being shared by a server over Dat (to ensure high availability) but you may connect to any number of users also hosting the content. - -You can also also view the files online: [datproject.org/778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639](https://datproject.org/778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639/). datproject.org can download files over Dat and display them on http as long as someone is hosting it. The website temporarily caches data for any visited links (do not view your dat on datproject.org if you do not want us caching your data). - -### Sharing Data - -We'll be creating a dat from a folder on your computer. If you are with a friend you can sync these files to their computer. Otherwise you can view them online via datproject.org to see how viewing a dat online works. - -Find a folder on your computer to share. Any kind of files work with Dat but for now, make sure it's something you want to share with your friends. Dat can handle all sorts of files (Dat works with really big folders too!). We like cat pictures. - -First, you can create a new dat inside that folder. 
Using the `dat create` command also walks us through making a `dat.json` file: - -``` -❯ dat create -Welcome to dat program! -You can turn any folder on your computer into a Dat. -A Dat is a folder with some magic. -``` - -This will create a new (empty) dat. Dat will print a link, share this link to give others access to view your files. - -Once we have our dat, run `dat share` to scan your files and sync them to the network. Share the link with your friend to instantly start downloading files. - -You can also try viewing your files online. Go to [datproject.org](https://datproject.org/explore) and enter your link to preview on the top right. *(Some users, including me when writing this, may have trouble connecting to datproject.org initially. Don't be alarmed! It is something we are working on. Thanks.)* diff --git a/docs/how-dat-works.md b/docs/how-dat-works.md index fe40a4d..91ee5cd 100644 --- a/docs/how-dat-works.md +++ b/docs/how-dat-works.md @@ -1,8 +1,8 @@ # How Dat Works -Note this is about Dat 2.0 and later. For historical info about earlier incarnations of Dat (Alpha, Beta) check out [this post](http://dat-data.com/blog/2016-01-19-brief-history-of-dat). +Note this is about Dat 2.0 and later. -Read the [dat whitepaper](https://github.com/datproject/docs/tree/master/papers) for technical details. +Read the [dat whitepaper](https://github.com/datproject/docs/tree/master/papers/dat-paper.md) for technical details. When someone starts downloading data with the [Dat command-line tool](https://github.com/datproject/dat), here's what happens: diff --git a/docs/install.md b/docs/install.md new file mode 100644 index 0000000..0d53b54 --- /dev/null +++ b/docs/install.md @@ -0,0 +1,55 @@ +# Welcome to Dat! + +Dat has a Desktop client, a commandline tool, and a Node.js library. 
If you'd like to read about how dat works, please [read how it works](/concepts) and if you're still hungry for more learning, [read the Dat paper](https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf). + +Have questions or need some guidance? You can chat with us in IRC on [#dat](http://webchat.freenode.net/?channels=dat) or [Gitter](https://gitter.im/datproject/discussions?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)! + +## Desktop Application + +If you don't want to use the terminal, you can use our desktop application on Mac or Linux (Windows coming soon). + +| Platform | Link | +|---------|-------------------| +| Mac | [Download .dmg](http://datproject.github.io/dat-desktop/mac) | +| Linux | [Download .AppImage](http://datproject.github.io/dat-desktop/linux) | +| Windows | Coming Soon | + + +## In the Terminal + +Dat can be installed in the terminal using `node`. Follow the instructions below to get started. + +1. **Install Node.** Dat requires Node version 4.0 or higher; however, we recommend the latest version. If you don't have node, [go to their website at nodejs.org and pick your platform.](https://nodejs.org/en/download/) If node is installed, you should be able to type the following to see which version you have: + +``` +$ node -v +8.0.0 +``` + +2. **Install Dat.** Dat is distributed using `npm`, the package manager for Node.js. 
Type the following command to install dat: + +``` +npm install -g dat +``` + +If dat was installed successfully, you might see output like this (on npm 5.0.0): +``` +/usr/local/bin/dat -> /usr/local/lib/node_modules/dat/bin/cli.js + +> utp-native@1.5.1 install /usr/local/lib/node_modules/dat/node_modules/utp-native +> node-gyp-build + + +> sodium-native@1.10.0 install /usr/local/lib/node_modules/dat/node_modules/sodium-native +> node-gyp-build "node preinstall.js" "node postinstall.js" + +added 321 packages in 9.662s +``` + +If you receive an `EACCES` error, read [this guide on fixing npm permissions](https://docs.npmjs.com/getting-started/fixing-npm-permissions) or use `sudo npm install -g dat`. + +If you're still having trouble installing dat, see the [troubleshooting section](/troubleshooting), [open an issue on Github](https://github.com/datproject/dat/issues/new), or [ask us a question in our chat room](https://gitter.im/datproject/discussions). + +## Next Steps + +You're all set! [Go on to the next page to start sharing data](/tutorial). diff --git a/docs/intro.md b/docs/intro.md index bcf78ff..7810a69 100644 --- a/docs/intro.md +++ b/docs/intro.md @@ -1,47 +1,33 @@ -# Welcome to Dat Docs! +# Welcome to Dat -Dat is the distributed data tool. +Ever tried moving large files and folders to other computers? Usually this involves one of a few strategies: being in the same location (usb stick), using a cloud service (Dropbox), or using old but reliable technical tools (rsync). None of these easily store, track, and share your data securely over time. People often are stuck choosing between security, speed, or ease of use. Dat provides all three by using a state of the art technical foundation and user friendly tools for fast and secure file sharing that you control. -Dat's open source applications offer a new experience in advanced file syncing and publishing. 
Wherever your data goes, Dat uses innovative *in place archiving* to link files from many locations together. Share data with anyone over a distributed network using encrypted connections. Dat brings a new ease to public data management with automatic version history, persistent links, and dynamic storage. +Dat is free software built for the public by Code for Science & Society, a nonprofit. Researchers, analysts, libraries, and universities are [already using dat](https://www.nytimes.com/2017/03/06/science/donald-trump-data-rescue-science.html) to archive and distribute scientific data. Developers are building applications on Dat for [browsing peer-to-peer websites](beakerbrowser.com) and [offline editable maps](https://www.digital-democracy.org/blog/update-from-the-ecuadorian-amazon/). Anyone can use Dat to backup files or share those cute cat pictures with a friend. Install and get started today by using the desktop application, command line, or JavaScript library. -Use Dat to distribute scientific data, browse remote files on demand, or run continuous file archiving. Integrate into your existing work flow with flexible storage options and http publishing. Dat connects existing web infrastructure with a modern technological foundation. Built on a decentralized network, Dat creates new opportunities for existing data publishing tools. Put data preservation at your finger tips, like never before, with user-first applications. **Secure**, **distributed**, **fast**. +Ready to try it? [Head over to Installation to get started.](/install) -<a href="https://datproject.org/install#desktop" target="_blank" title="Install Dat Desktop"><img src="/assets/install_desktop.png" alt="Install Dat Desktop" style="width:250px;"/></a> -<a href="https://datproject.org/install#terminal" target="_blank" title="Install dat command line"><img src="/assets/install_cli.png" alt="Install dat command line" style="width:250px;"/></a> +## Why Dat? 
-**Built for the Public Good** +When sharing files, current tools have tradeoffs: lower costs and ease of use, or security and speed. Cloud services, such as Dropbox or GitHub, force users to store data on places outside of their control. Until now, it has been very difficult to avoid centralized servers without major sacrifices. Dat's unique distributed network allows users to store data where they want. By decentralizing storage, Dat also increases speeds by downloading from many sources at the same time. -Dat's distributed team builds Dat openly and compassionately. All software is open source and freely available to use. The [Dat project](http://datproject.org) is led by [Code for Science & Society](http://codeforscience.org) (CSS), a U.S. nonprofit. The mission of CSS is to work with public institutions to produce open source infrastructure for researchers, civic hackers, and journalists. We want to improve data access and long-term preservation. We actively welcome outside contributors and use cases beyond our mission. +Having a history of how files have changed is essential for effective collaboration and reproducibility. Git has been promoted as a solution for history, but it becomes slow with large files and a high learning curve. Git is designed for editing source code, while Dat is designed for sharing files. With a few simple commands, you can version files of any size. People can instantly get the latest files or download previous versions. -Code for Science & Society hosts other open science initiatives including [Science Fair](https://github.com/codeforscience/sciencefair/), a desktop science library like nothing before, and [Stencila](https://github.com/stencila), the office suite for reproducible research. Science Fair uses Dat to distribute scientific literature. In the future, Stencila will use Dat for reproducible data analysis. +In sum, we've taken the best parts of Git, BitTorrent, and Dropbox to design Dat. 
Learn more about how it all works by learning our [key concepts](/concepts) or get more technical by reading [the Dat whitepaper](https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf). -**Get in touch:** +#### Distributed Network -* [github.com/datproject](http://github.com/datproject) -* [@dat_project](http://twitter.com/dat_project) -* Chat in [#dat on IRC](http://webchat.freenode.net/?channels=dat) or via [gitter](https://gitter.im/datproject/discussions) +Dat works on a distributed network unlike cloud services, such as Dropbox or Google Drive. This means Dat transfers files peer to peer, skipping centralized servers. Dat's network makes file transfers faster and more secure. You can even use Dat on local networks for offline file sharing or local backups. Dat reduces bandwidth costs on popular files, as downloads are *distributed* across all available computers, rather than centralized on a single host. -## Getting Started +#### Data History -If you are new to Dat, welcome! You can learn more about Dat concepts in [the overview](overview). Becoming familiar with core Dat concepts will help you when using Dat and reading our documentation. +Dat makes it easy for you to save old versions of files. With every file update, Dat automatically tracks your changes. You can even direct these backups to be stored efficiently on an external hard drive or a cloud server by using [our archiver](/on-a-server). -If you are ready to get started, pick a Dat client and install! +#### Security -## Features +Dat transfers files over an encrypted connection using state-of-the-art cryptography. Only users with your unique link can access your files. Your dat link allows users to download and re-share your files. To write updates to a dat, users must have the secret key. Dat also verifies the hashes of files on download so no malicious content can be added. -* **Secure** - Dat encrypts data transfers and verifies content on arrival. 
Dat prevents third-party access to metadata and content. [Learn more](faq#security-and-privacy) about security & privacy. -* **Distributed** - Connect directly to other users sharing or downloading common datasets. Any device can share files without need for centralized servers. [Read more](terms#distributed-web) about the distributed web. -* **Fast** - Share files instantly with in-place archiving. Download only the files you want. Quickly sync updates by only downloading new data, saving time and bandwidth. -* **Transparent** - A complete version history improves transparency and auditability. Changes are written in append-only logs and uniformly shared throughout the network. -* **Future-proof** - Persistent links identify and verify content. These unique ids allow users to host copies, boosting long-term availability without sacrificing provenance. +## Who we are -## Installation +Dat is funded by [Code for Science & Society](https://codeforscience.org), a nonprofit supporting open source tools that benefit science and society. Dat also has a vibrant global community of developers building apps on the Dat protocol. - View the [installation guide](http://datproject.org/install) or pick your favorite client application: - -<a href="https://datproject.org/install#desktop" target="_blank" title="Install Dat Desktop"><img src="/assets/install_desktop.png" alt="Install Dat Desktop" style="width:250px;"/></a> -<a href="https://datproject.org/install#terminal" target="_blank" title="Install dat command line"><img src="/assets/install_cli.png" alt="Install dat command line" style="width:250px;"/></a> - -* [Beaker Browser](http://beakerbrowser.com) - An experimental p2p browser with built-in support for the Dat protocol. -* [Dat Protocol](https://www.datprotocol.com) - Build your own application on the Decentralized Archive Transport (Dat) protocol. -* [require('dat')](http://github.com/datproject/dat-node) - Node.js library for downloading and sharing Dat archives. 
+Enough reading, more doing? [Head over to Installation to get started.](/install) diff --git a/docs/overview.md b/docs/overview.md index fd70789..73cd73d 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -1,91 +1,17 @@ -# Dat Concept Overview +# About Dat -This overview will introduce you to Dat, a new type of distributed data tool, and help you make the most of it. Dat uses modern cryptography, decentralized networks, and content versioning so you can share and publish data with ease. +Dat's open source applications offer a new experience in advanced file syncing and publishing. Wherever your data goes, Dat uses innovative *in place archiving* to link files from many locations together. Share data with anyone over a distributed network using encrypted connections. Dat brings a new ease to public data management with automatic version history, persistent links, and dynamic storage. -With Dat, we want to make data sharing, publishing, and archiving fit into your workflow. Build with the needs of researchers, librarians, and developers in mind, Dat's unique design works wherever you store your data. You can keep files synced whether they're on your laptop, a public data repository, or in a drawer of hard drives. Dat securely ties all these places together, creating a dynamic data archive. Spend less time managing files, more time digging into data (unfortunately we cannot sort your hard drive drawer, yet). +Use Dat to distribute scientific data, browse remote files on demand, or run continuous file archiving. Integrate into your existing work flow with flexible storage options and http publishing. Dat connects existing web infrastructure with a modern technological foundation. Built on a decentralized network, Dat creates new opportunities for existing data publishing tools. Put data preservation at your finger tips, like never before, with user-first applications. **Secure**, **distributed**, **fast**. 
-**Install Dat now.** Then you can play with Dat while learning how it works! +With Dat, we want to make data sharing, publishing, and archiving fit into your workflow. Built with the needs of researchers, librarians, and developers in mind, Dat's unique design works wherever you store your data. You can keep files synced whether they're on your laptop, a public data repository, or in a drawer of hard drives. Dat securely ties all these places together, creating a dynamic data archive. Spend less time managing files, more time digging into data (unfortunately we cannot sort your hard drive drawer, yet). -<a href="https://datproject.org/install#desktop" target="_blank" title="Install Dat Desktop"><img src="/assets/install_desktop.png" alt="Install Dat Desktop" style="width:250px;"/></a> -<a href="https://datproject.org/install#terminal" target="_blank" title="Install dat command line"><img src="/assets/install_cli.png" alt="Install dat command line" style="width:250px;"/></a> +[If you'd like you read more about how dat works, see our whitepaper.](https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf) -## In Place Archiving +## Built for the Public Good -You can turn any folder on your computer into *a dat*. We call this *in place archiving*. A dat is a regular folder with some magic attached. The magic is a set of metadata files, in a `.dat` folder. Dat uses the metadata to track file history and securely share your files. Your files and the `.dat` folder can be instantly synced to anywhere. +Dat's distributed team builds Dat openly and compassionately. All software is open source and freely available to use. The [Dat project](http://datproject.org) is led by [Code for Science & Society](http://codeforscience.org) (CSS), a U.S. nonprofit. The mission of CSS is to work with public institutions to produce open source infrastructure for researchers, civic hackers, and journalists. We want to improve data access and long-term preservation. 
We actively welcome outside contributors and use cases beyond our mission. -<img src="/assets/dat_folder.png" alt="Create a dat with any folder" style="width:500px;"/> +Code for Science & Society hosts other open science initiatives including [Science Fair](https://github.com/codeforscience/sciencefair/), a desktop science library like nothing before, and [Stencila](https://github.com/stencila), the office suite for reproducible research. Science Fair uses Dat to distribute scientific literature. In the future, Stencila will use Dat for reproducible data analysis. -Once installing Dat, you can use a single command to live sync your files to friends, backup to an external drive, and publish to a website (so people can download over http too!). The cool part is this all happens at the same time. If you go offline for a bit, no worries. Dat shares the latest files and any saved history once you are back online. These data transfers happen between the computers, forgoing any centralized source. - -In place archiving in Dat really means **any place**. Dat seamlessly syncs your files where you want and when you want. Dat's decentralized technology and automatic versioning will improve data availability and data quality without sacrificing ease of use. - -## Distributed Network - -Dat goes beyond regular archiving through it's *distributed network*. When you share data, Dat sends data to many download locations at once, and they can sync the same data with each other! By connecting users directly Dat transfers files faster, especially sharing on a local network. Distributed syncing allows robust global archiving for public data. - -<img src="/assets/share_link.png" alt="Share unique dat link" style="width:500px;"/> - -To maintain privacy, the dat link controls access to your data. Any data shared in the network is encrypted using your link as the password. Learn more about Dat's securtiy and privacy below or in [the faqs](faq#security-and-privacy). 
We are also investigating ways to improve [reader privacy](https://blog.datproject.org/2016/12/12/reader-privacy-on-the-p2p-web/) for public data. - -## Version History - -Dat automatically maintains a built in version history whenever files are added. Dat uses this history to allow partial downloads of files, for example only getting the latest files. There are two types of versioning performed automatically by Dat. Metadata is stored in a folder called `.dat` in the main folder of a repository, and data is stored as normal files in the main folder. - -Dat uses append-only registers to store version history. This means all changes are written to the end of the file, growing over time. - -### Metadata Versioning - -Dat acts as a one-to-one mirror of the state of a folder and all it's contents. When importing files, Dat grabs the filesystem metadata for each file and checks if there is already an entry for this filename. If the file with this metadata matches exactly the newest version of the file metadata stored in Dat, then this file will be skipped (no change). - -If the metadata differs or does not exist, then this new metadata entry will be appended as the new 'latest' version for this file in the append-only SLEEP metadata content register. - -### Content Versioning - -The metadata only tells you if or when a file is changed, now how it changed. In addition to the metadata, Dat tracks changes in the content in a similar manner. - -The default storage system used in Dat stores the files as files. This has the advantage of being very straightforward for users to understand, but the downside of not storing old versions of content by default. - -In contrast to other version control systems, like Git, Dat only stores the current set of files, not older versions. Git, for example, stores all previous content versions and all previous metadata versions in the `.git` folder. But Dat is designed for larger datasets. 
-
-Storing all history on content could easily fill up the user's hard drive. Dat has multiple storage modes based on usage. With Dat's dynamic storage, you can store the content history on a local external hard drive or on a remote server (or both!).
-
-## Dat Privacy
-
-Files shared with Dat are encrypted (using the link) so *only* users with your unique link can access your files. The link acts as a kind of password meaning, generally, you should assume *anyone* with the link will have access to your files.
-
-The link allows users to download, and re-share, your files, whether you intended them to have the link or not (with some hand-waving assumptions about them being able to connect to you, which can be limited, see more in [security & privacy faq](faq#security-and-privacy)).
-
-Make sure you are thoughtful about who you share links with and how. Dat ensures links cannot be intercepted through the Dat network. If you share your links over other channels, ensure the privacy & security matches or exceeds your data security needs. We try to limit times when Dat displays full links to avoid accidental sharing.
-
-## dat:// links
-
-Dat links have some special properties that are helpful to understand.
-
-Traditionally, http links point to a specific server, e.g. datproject.org's server, and/or a specific resource on that server. Unfortunately, links often break or the content changes without notification (this makes it impossible to cite `nytimes.com`, for example, because the link is meaningless without a reference to what content was there at citation time).
-
-You may have seen Dat links around:
-
-```
-dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93
-```
-
-What is with the weird long string of characters? Let's break it down!
-
-**`dat://` - the protocol**
-
-The first part of the link is the link protocol, Dat (read about the Dat protocol at [datprotocol.com](http://www.datprotocol.com)).
The protocol describes what "language" the link is in and what type of applications can open it. You do not always need this part with Dat but it is helpful context.
-
-**`ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93` - the unique identifier**
-
-The second part of the link is a 64-character hex string (an [ed25519 public key](https://ed25519.cr.yp.to/) to be precise). Each Dat archive gets a public key link to identify it. With the hex string as a link we can do a few things:
-
-1. Encrypt the data transfer
-2. Create a persistent identifier, an ID that never changes, even as files are updated (as opposed to a checksum which is based on the file contents).
-
-**`dat://ff34725120b2f3c5bd5028e4f61d14a45a22af48a7b12126d5d588becde88a93`**
-
-All together, the links can be thought of similarly to a web URL, as a place to get content, but with some extra special properties. When you download a dat link:
-
-1. You do not have to worry about where the files are stored.
-2. You can always get the latest files available.
-3. You can view the version history or add version numbers to links to get a permanent link to a specific version.
+If you were looking to install dat, you're in the wrong place -- [installation is on a different page!](/install).
diff --git a/docs/terms.md b/docs/terms.md
index a8aa52e..f0425ce 100644
--- a/docs/terms.md
+++ b/docs/terms.md
@@ -1,7 +1,5 @@
 # Terminology
-## Dat Terms
-
 Terms specific to the Dat software.
 ### dat, Dat archive, archive
@@ -10,10 +8,6 @@ A dat, or Dat archive, is a set of files and dat metadata (see [SLEEP](#sleep)).
 When you create a dat, you're creating a `.dat` folder to hold the metadata and the dat keys (a public and secret key).
-### Dat Link or Dat Key
-
-Identifier for a dat, e.g. `dat://ab3ed4f...`. These are 64 character hashes with the `dat://` protocol prefix. Anyone with the Dat link can download and re-share files in a dat.
-
 ### Secret Key
 Dat links are the public part of a key pair.
Users that have the secret key are able to write updates to a dat.
@@ -82,7 +76,7 @@ The discovery key is a hashed public key. The discovery key is used to find peer
 A feed is a term we use interchangeably with the term "append-only log". It’s the lowest level component of Dat. For each Dat, there are two feeds - the metadata and the content.
-Feeds are created with hypercore.
+Feeds are created with hypercore.
 ### Metadata Feed
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index f298a81..45e3fda 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -62,17 +62,10 @@ For direct connection tests, the doctor will print out a command to run on the o
 ## Installation Troubleshooting
-### Dat Desktop
-
-TODO
-
-### Command Line
-
-To use the Dat command line tool you will need to have [node and npm installed](https://docs.npmjs.com/getting-started/installing-node). Make sure those are installed correctly before installing Dat. Dat only supports Node versions 4 and above. You can check the version of each:
+To use the Dat command line tool you will need to have [node and npm installed](https://docs.npmjs.com/getting-started/installing-node). Make sure those are installed correctly before installing Dat. Dat only supports Node versions 4 and above.
 ```
 node -v
-npm -v
 ```
 #### Global Install
diff --git a/docs/tutorial.md b/docs/tutorial.md
new file mode 100644
index 0000000..b451b03
--- /dev/null
+++ b/docs/tutorial.md
@@ -0,0 +1,74 @@
+# Getting Started with Dat
+
+This is a tutorial for the Dat command line tool. If you don't use the command line, don't worry. There is a desktop app that makes it easy for anyone to share and download data using Dat. Anyone using a Dat application will work, it does not matter which application they are using. [Download the Desktop Application](/install#desktop-application)
+
+In this tutorial we will go through the two main ways to use Dat, sharing data and downloading data.
If possible, this is great to go through with a partner to see how Dat works across computers.
+
+## Features
+
+* **Secure** - Dat encrypts data transfers and verifies content on arrival. Dat prevents third-party access to metadata and content. [Learn more](faq#security-and-privacy) about security & privacy.
+* **Distributed** - Connect directly to other users sharing or downloading common datasets. Any device can share files without need for centralized servers. [Read more](terms#distributed-web) about the distributed web.
+* **Fast** - Share files instantly with in-place archiving. Download only the files you want. Quickly sync updates by only downloading new data, saving time and bandwidth.
+* **Transparent** - A complete version history improves transparency and auditability. Changes are written in append-only logs and uniformly shared throughout the network.
+* **Future-proof** - Persistent links identify and verify content. These unique ids allow users to host copies, boosting long-term availability without sacrificing provenance.
+
+## Installing Dat
+
+To install dat in the Terminal, use `npm install -g dat`. For more information, see the [Installation page](/install).
+
+## Downloading Data
+
+Similar to git, you download somebody's dat by running `dat clone <link>`. A dat link is like an http:// link, but with special properties.
+
+As an example, we created a dat that you can download. It just contains a couple of small files.
+
+```
+dat clone dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639 ~/Downloads/dat-demo
+```
+
+![clone](https://raw.githubusercontent.com/datproject/docs/master/assets/cli-clone.gif)
+
+This will download our demo files to the `~/Downloads/dat-demo` folder. These files are being shared by a server over Dat (to ensure high availability). When you download data, you may connect to any number of users who are running dat, too. The more users that are running dat the faster it downloads.
+
+You can also view the files online: [https://datproject.org/778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639](https://datproject.org/778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639/). Datproject.org previews a dat in the browser -- as long as someone is hosting it. The website temporarily caches data for any visited links (i.e., do not view your dat on datproject.org if you do not want us caching your data).
+
+## Creating a Dat
+
+Now, let's share some data and create a dat from a folder on your computer.
+
+Find a folder on your computer to share. Any kind of files work with Dat but for now, make sure it's something you want to share. Dat can handle all sorts of files (Dat works with really big folders too!). We like cat pictures.
+
+First, you can create a new dat inside that folder. Using the `dat create` command initializes the dat and allows us to give it some information so that other people and applications can easily display what is in the dat.
+
+```
+dat create
+> Title My Amazing Data
+> Title My Awesome Dat
+> Description This is a dat
+
+Created empty Dat in /Users/me/MyData/.dat
+```
+
+This will create a new (empty) dat. A folder called `.dat` is created, which contains a bunch of metadata files that keep the dat in sync. To learn more about what these files are, read the [Overview](/overview) or [read the Dat paper](/paper).
+
+## Sharing data
+
+Your dat has been created, and now it's time to scan and sync the data to someone else. In the same folder, run the following command:
+
+```
+dat share
+```
+
+![share](https://raw.githubusercontent.com/datproject/docs/master/assets/cli-share.gif)
+
+As long as this process is running, you can share the link with your friend and they can instantly start downloading your files.
+
+If you don't want the other person to download dat, you can also send them a link and they can see the contents in the browser.
Go to [http://datproject.org](https://datproject.org) and enter your link to preview on the top right. *(Some users, including me when writing this, may have trouble connecting to datproject.org initially. It might take some time to initially connect, but if you wait and refresh it should show the files. We are actively working on improving this performance. Thanks.)*
+
+## Keeping data alive
+
+Your data will be available on the network as long as the process is open. However, if you need to close your laptop or turn off the computer, you might want to host the dat long-term on a server.
+
+First, you need to purchase a server on your own. We recommend using [Digital Ocean](digitalocean.com), or setting up a [data silo in your house](https://github.com/datproject/datasilo).
+
+Once you have a server available, head over to the [Running Dats on a Server section to automatically re-host your dat](/server).