feat: atrium-repo #272

Open
wants to merge 43 commits into
base: main
43 commits
bfb8c17
chore: Create empty `atrium-repo` library crate
str4d May 7, 2024
682f0e4
repo: Logic for parsing and traversing MST nodes
str4d May 9, 2024
e33db08
repo: Indexed reader for CAR files
str4d May 11, 2024
cb49514
repo: APIs to fetch keys, collections, and records from a repository
str4d May 11, 2024
1d4c101
Significant refactor; implement basic MST update logic
DrChat Dec 21, 2024
d633238
Full MST implementation
DrChat Dec 22, 2024
5343362
More refactoring :)
DrChat Dec 23, 2024
e9532bb
Use `traverse` for `split_subtree`
DrChat Dec 25, 2024
4e2f052
Fix a few lingering bugs in node deletion
DrChat Dec 26, 2024
a1468de
Update firehose example
DrChat Dec 26, 2024
8ce976a
More validation in mst dual-layer deletion test
DrChat Dec 27, 2024
e80ee06
`Tree::get_path`
DrChat Dec 27, 2024
98604a3
Only verify SHA2_256 CAR blocks
DrChat Dec 27, 2024
752f073
Documentation pass
DrChat Dec 27, 2024
9b91158
MST: Handle edge case where we need to insert new tree entry at leftm…
DrChat Dec 27, 2024
bee8f13
Fix repo tests
DrChat Dec 27, 2024
f7fa5a9
More docs
DrChat Dec 28, 2024
63eb5e5
Simplify `Tree::keys`
DrChat Dec 30, 2024
daa6165
Return keys from `Tree::keys` in lexicographic order
DrChat Dec 30, 2024
381c304
Create a high-level `Commit` helper struct
DrChat Dec 30, 2024
16db1bd
Add new `Tree::entries` function to return key/value tuples
DrChat Dec 30, 2024
ce8e32b
Add documentation clarifying that tree iteration does not work agains…
DrChat Dec 30, 2024
d281bd6
Whoops :)
DrChat Dec 30, 2024
973b06c
Expose hashing algorithm to `AsyncBlockStoreWrite::write_block`
DrChat Dec 30, 2024
caa3681
Forgot to commit this import :(
DrChat Dec 30, 2024
cb7222c
Gracefully handle multi-writes to `CarStore`
DrChat Dec 30, 2024
b4e8a99
Rename `Repository::new` to `open`
DrChat Dec 31, 2024
5ce2e07
Outline `algos::compute_depth`
DrChat Jan 1, 2025
f7b910e
Correct the docs for `Node::find_ge`
DrChat Jan 1, 2025
9665d2b
Panic if we attempt to serialize a `Node` with two adjacent tree entries
DrChat Jan 1, 2025
91bee28
Misc. improvements
DrChat Jan 3, 2025
f01c0ba
More `Tree` documentation
DrChat Jan 3, 2025
f31f338
Remove `AsyncBlockStoreWrite::delete_block`
DrChat Jan 5, 2025
5cfc9c3
Fix a bug when writing out a new CAR block
DrChat Jan 5, 2025
4a921c1
API updates; Implement `CarStore::create`
DrChat Jan 12, 2025
50ca94d
Update firehose example to remove CID compat
DrChat Jan 12, 2025
baffd15
Move all algorithms to `algos` module
DrChat Jan 18, 2025
5f9edb4
Fix warnings
DrChat Jan 18, 2025
7612064
clippy + fmt
DrChat Jan 19, 2025
5da91d0
Full CRUD functionality for `Repository`
DrChat Jan 28, 2025
cb8e8c0
Add a basic differencing blockstore layer
DrChat Jan 31, 2025
ef20156
Remove `async-trait` dep
DrChat Feb 1, 2025
bbd9cd1
Add my name to `atrium-repo` authors
DrChat Feb 1, 2025
3 changes: 3 additions & 0 deletions .gitignore
@@ -12,5 +12,8 @@ target/
# These are backup files generated by rustfmt
**/*.rs.bk

# VSCode settings
/.vscode
Owner

I do not want to include anything related to a specific editor.

Author

I added this line because VSCode annoyingly creates this folder automatically (likely due to the Copilot extension being broken and populating a settings.json).
Without this ignore line, it's almost certain that I or someone else will eventually commit this folder by accident.

If you don't want to have a line that explicitly mentions VSCode, another alternative you could consider is changing this .gitignore file to an allowlist.
Here's an example from one of my other projects.

Owner

Editor-specific ignore rules belong in each contributor's own global ignore settings, and I don't want to add repository-side changes for them.

According to man gitignore:

Patterns which a user wants Git to ignore in all situations (e.g., backup or temporary files generated by the user’s editor of choice) generally go into a file specified by core.excludesFile in the user’s ~/.gitconfig. Its default value is $XDG_CONFIG_HOME/git/ignore. If $XDG_CONFIG_HOME is either not set or empty, $HOME/.config/git/ignore is used instead.

I believe that editor-related ignore settings should be written there.

Author

Could I still persuade you to reconsider? :)
This is commonly done in many projects - for example, Rust itself has a .gitignore file checked in with this line present.

I do work with some projects that are exceptions to this rule and feel like a global setting would be more cumbersome than helpful.


# MSVC Windows builds of rustc generate these, which store debugging information
*.pdb
128 changes: 128 additions & 0 deletions Cargo.lock


20 changes: 15 additions & 5 deletions Cargo.toml
@@ -3,18 +3,22 @@ members = [
"atrium-api",
"atrium-common",
"atrium-crypto",
"atrium-repo",
"atrium-xrpc",
"atrium-xrpc-client",
"atrium-oauth/identity",
"atrium-oauth/oauth-client",
"bsky-cli",
"bsky-sdk",

# "examples/concurrent",
"examples/firehose",
]
# Examples show how to use the latest published crates, not the workspace state.
exclude = [
"examples/concurrent",
"examples/firehose",
"examples/video",
# "examples/concurrent",
Owner

Is there a reason why these are no longer excluded?

Author

Ah - I had forgotten about this.
I changed this because the examples were written against the last released version of atrium rather than the version present in the repository.
I needed to update the firehose example to incorporate the changes I have made here, and chose to change its dependencies to point at the crates in this repository.

Having the examples demonstrate the last public release is an unusual choice. I haven't seen this done in any other Rust library.
Any thoughts about changing this?

Owner

Indeed. The current policy of using the latest release version does not allow us to use newly added packages.

We discussed this before and settled on the current approach:
#100 (comment)

But maybe we should reconsider.

Author

Indeed - I think it would be wise to reconsider :)

One potential alternative to allow people to find examples for the previously released version would be to create git tags upon release and instruct users to look for the corresponding tag.
It looks like release-plz is already creating these tags for you (though it appears to do so per crate).

# "examples/firehose",
# "examples/video",
]
resolver = "2"

@@ -27,13 +31,17 @@ keywords = ["atproto", "bluesky"]

[workspace.dependencies]
# Intra-workspace dependencies
atrium-api = { version = "0.24.10", path = "atrium-api", default-features = false }
atrium-api = { version = "0.24.10", path = "atrium-api" }
atrium-common = { version = "0.1.0", path = "atrium-common" }
atrium-crypto = { version = "0.1.2", path = "atrium-crypto" }
atrium-identity = { version = "0.1.0", path = "atrium-oauth/identity" }
atrium-xrpc = { version = "0.12.0", path = "atrium-xrpc" }
atrium-xrpc-client = { version = "0.5.10", path = "atrium-xrpc-client" }
bsky-sdk = { version = "0.1.15", path = "bsky-sdk" }

# async in streams
async-stream = "0.3"

# DAG-CBOR codec
ipld-core = { version = "0.4.1", default-features = false, features = ["std"] }
serde_ipld_dagcbor = { version = "0.6.0", default-features = false, features = ["std"] }
@@ -49,6 +57,7 @@ serde = "1.0.202"
serde_bytes = "0.11.9"
serde_html_form = "0.2.6"
serde_json = "1.0.125"
unsigned-varint = "0.8"

# Cryptography
ecdsa = "0.16.9"
@@ -68,7 +77,8 @@ hickory-resolver = "0.24.1"
http = "1.1.0"
lru = "0.12.4"
moka = "0.12.8"
tokio = { version = "1.39", default-features = false }
tokio = { version = "1.39", default-features = false, features = ["io-util"] }
tokio-util = "0.7"

# HTTP client integrations
isahc = "1.7.2"
37 changes: 36 additions & 1 deletion atrium-api/src/types/string.rs
@@ -9,6 +9,22 @@ use regex::Regex;
use serde::{de::Error, Deserialize, Deserializer, Serialize, Serializer};
use std::{cmp, ops::Deref, str::FromStr, sync::OnceLock};

// Reference: https://github.com/bluesky-social/indigo/blob/9e3b84fdbb20ca4ac397a549e1c176b308f7a6e1/repo/tid.go#L11-L19
fn s32_encode(mut i: u64) -> String {
    const S32_CHAR: &[u8] = b"234567abcdefghijklmnopqrstuvwxyz";

    let mut s = String::new();
    for _ in 0..13 {
        let c = i & 0x1F;
        s.push(S32_CHAR[c as usize] as char);

        i >>= 5;
    }

    // Reverse the string to convert it to big-endian format.
    s.as_str().chars().rev().collect()
}

/// Common trait implementations for Lexicon string formats that are newtype wrappers
/// around `String`.
macro_rules! string_newtype {
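
The encoding added in this hunk can be checked with a round trip. The sketch below copies `s32_encode` from the diff and pairs it with a hypothetical `s32_decode`, which is not part of this PR and is named here only for illustration. It assumes the input is exactly 13 characters drawn from the same alphabet.

```rust
const S32_CHAR: &[u8] = b"234567abcdefghijklmnopqrstuvwxyz";

// Copied from the diff: 13 characters, 5 bits each, least significant
// group pushed first and then reversed into big-endian order.
fn s32_encode(mut i: u64) -> String {
    let mut s = String::new();
    for _ in 0..13 {
        let c = i & 0x1F;
        s.push(S32_CHAR[c as usize] as char);
        i >>= 5;
    }
    s.as_str().chars().rev().collect()
}

// Hypothetical inverse (an assumption, not in the PR). Returns `None` if the
// input is not 13 bytes long or contains a byte outside the alphabet.
fn s32_decode(s: &str) -> Option<u64> {
    if s.len() != 13 {
        return None;
    }
    let mut value = 0u64;
    for b in s.bytes() {
        // The index within the alphabet is the 5-bit digit value.
        let digit = S32_CHAR.iter().position(|&c| c == b)? as u64;
        value = (value << 5) | digit;
    }
    Some(value)
}

fn main() {
    // The two cases asserted by the `tid_encode` test in this PR.
    assert_eq!(s32_encode(0), "2222222222222");
    assert_eq!(s32_encode(1), "2222222222223");

    // Round trip through the hypothetical decoder.
    let n = 1_700_000_000_000_000u64 << 10;
    assert_eq!(s32_decode(&s32_encode(n)), Some(n));
    assert_eq!(s32_decode("not-a-tid"), None);
}
```

Thirteen 5-bit characters cover 65 bits, so the leading character of an encoded `u64` only ever carries four bits. The decoder above does not reject a leading character outside that range, which a stricter version might want to do.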
@@ -410,7 +426,7 @@ impl Serialize for Language {

/// A [Timestamp Identifier].
///
/// [Timestamp Identifier]: https://atproto.com/specs/record-key#record-key-type-tid
/// [Timestamp Identifier]: https://atproto.com/specs/tid
#[derive(Clone, Debug, PartialEq, Eq, Serialize, Hash)]
#[serde(transparent)]
pub struct Tid(String);
@@ -436,6 +452,19 @@ impl Tid {
}
}

    /// Construct a new timestamp with the specified clock ID.
    ///
    /// Clock IDs 0-31 can be used as an ad-hoc clock ID if you are not concerned
    /// with this parameter.
    pub fn now(cid: u32) -> Self {
        let now = chrono::Utc::now().timestamp_micros() as u64;

        // The TID is laid out as follows:
        // 0TTTTTTTTTTTTTTT TTTTTTTTTTTTTTTT TTTTTTTTTTTTTTTT TTTTTTCCCCCCCCCC
        let tid = (now << 10) & 0x7FFF_FFFF_FFFF_FC00 | (cid as u64) & 0x3FF;
        Self(s32_encode(tid))
    }

    /// Returns the TID as a string slice.
    pub fn as_str(&self) -> &str {
        self.0.as_str()
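
To make the bit layout in `Tid::now` concrete, here is a small sketch that applies the same mask-and-shift expression to a fixed timestamp instead of `chrono::Utc::now()`. The helper names `pack_tid` and `unpack_tid` are hypothetical and do not appear in the PR. The layout it assumes is the one in the code comment: one zero bit, 53 timestamp bits, then 10 clock ID bits.

```rust
// Hypothetical helper: the same expression as in `Tid::now`, with the
// timestamp passed in so the result is deterministic.
fn pack_tid(micros: u64, clock_id: u32) -> u64 {
    // Bit 63 is forced to 0, bits 62..=10 hold the timestamp in microseconds,
    // and bits 9..=0 hold the clock ID.
    ((micros << 10) & 0x7FFF_FFFF_FFFF_FC00) | ((clock_id as u64) & 0x3FF)
}

// Hypothetical inverse: split a packed value back into (micros, clock_id).
fn unpack_tid(tid: u64) -> (u64, u32) {
    (tid >> 10, (tid & 0x3FF) as u32)
}

fn main() {
    // An arbitrary instant in November 2023, in microseconds since the epoch.
    let micros = 1_700_000_000_000_000u64;

    let tid = pack_tid(micros, 5);
    assert_eq!(tid >> 63, 0); // the top bit is always clear
    assert_eq!(unpack_tid(tid), (micros, 5));

    // The 0x3FF mask keeps only the low ten bits of the clock ID,
    // so larger values wrap rather than corrupt the timestamp.
    assert_eq!(unpack_tid(pack_tid(micros, 1024 + 7)), (micros, 7));
}
```

Because the timestamp occupies the high bits and the encoding is big-endian, TIDs produced this way sort lexicographically in time order, with the clock ID only breaking ties within the same microsecond.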
@@ -766,6 +795,12 @@ mod tests {
}
}

    #[test]
    fn tid_encode() {
        assert_eq!(s32_encode(0), "2222222222222");
        assert_eq!(s32_encode(1), "2222222222223");
    }

    #[test]
    fn valid_tid() {
        for valid in ["3jzfcijpj2z2a", "7777777777777", "3zzzzzzzzzzzz"] {
9 changes: 9 additions & 0 deletions atrium-repo/CHANGELOG.md
@@ -0,0 +1,9 @@
# Changelog
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

Initial release.