Begin transmission…
Dealing with tRPC Type Incompatibilities
by Joe Wagner
If you’re not familiar with tRPC, it allows you to easily build & consume fully type-safe APIs without schemas or code generation. This week we hit an interesting bug caused by type incompatibilities between patch versions of a Studio dependency (the Studio is a Tableland web app we’re currently building that lets you create projects and manage tables from a UI).
We had to create an adapter for the tRPC package that maps Node.js HTTP request-response cycles to Fetch API request-response cycles. It might not be the perfect solution, but it resolved the issue. Here’s a snippet of what we did:
```ts
async function startStudioApi({ store }: { store: Store }) {
  const apiRouter = appRouter(
    store,
    process.env.POSTMARK_API_KEY!,
    (seal: string) => `${TEST_API_BASE_URL}/invite?seal=${seal}`,
    process.env.DATA_SEAL_PASS!,
  );
  // Create a local server to receive data from.
  const apiServer = http.createServer(async function (req: any, res: any) {
    // When we receive a request, the first step is to map the Node.js
    // request object to an object that looks like a fetch request.
    try {
      req.url = `${TEST_API_BASE_URL}${req.url}`;
      req.headers = new Headers(req.headers);
      req.text = function () {
        return new Promise(function (resolve, reject) {
          const body: any[] = [];
          req
            .on("data", (chunk: any) => {
              body.push(chunk);
            })
            .on("end", () => {
              resolve(Buffer.concat(body).toString());
            });
        });
      };
      // Now that we have a request object that meets the needs of the
      // fetch request handler, we can use the web app's adapter.
      const response = await fetchRequestHandler({
        endpoint: "/api/trpc",
        req,
        router: apiRouter,
        createContext,
      });
      // In order to respond to the original Node.js-style request, we
      // map the fetch response to a Node.js response.
      const responseHeaders = Object.fromEntries(response.headers.entries());
      res.writeHead(response.status, responseHeaders);
      // Using `as unknown as` because of
      // https://github.com/DefinitelyTyped/DefinitelyTyped/discussions/62651
      const body = response.body
        ? await streamToString(response.body as unknown as NodeJS.ReadableStream)
        : "";
      res.end(body);
    } catch (err: any) {
      console.log(err);
      res.writeHead(500, { "Content-Type": "application/json" });
      res.end(
        `{"error":{"json":{"message":"${err.message}","code":-32603,"data":{"code":"INTERNAL_SERVER_ERROR","httpStatus":500}}}}`,
      );
    }
  });
  return apiServer;
}
```
JSON Parsing in Go Across the WASM Bridge Takes a Lot of Finesse
Our SQL parser is written in Go, and we compile it to WASM (via tinygo) for use in browsers and other JavaScript environments. So far, we haven’t included direct access to the underlying abstract syntax tree (AST) via WASM because, frankly, it would have been a lot of work to expose all of those methods and types across the WASM “bridge”. One of the big hurdles was that tinygo didn’t have very good support for encoding/json and reflection. But recently, tinygo added better support for both, so we thought it might be a good opportunity to provide lower-level access to our AST from WASM (plus, our own team wanted this feature). While the process of updating and upgrading tinygo was pretty painless, adding the encoding/json dependency and leveraging Go’s JSON marshaling added a lot to our WASM bundle size: almost double the size for only a little more functionality! Now, I didn’t spend much time optimizing things, but in addition to the larger size, our use of interfaces in the parser made encoding/decoding JSON a real pain to work with. To hack around the fact that we’re dealing with polymorphic JSON, I tried leveraging type discriminator fields, embedding structs, and more. But in the end, this was one case where the engineering effort required to get things working the way we wanted wasn’t going to be worth it. Lesson learned, and I now know a lot more about Go reflection, JSON parsing, and WASM… so that’s still a win!
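For context, the type-discriminator approach mentioned above usually looks something like the sketch below. These node types are hypothetical stand-ins for illustration, not our actual parser’s AST:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node is the interface our hypothetical AST nodes implement.
type Node interface{ node() }

// NumberLiteral is a leaf node; "type" is the discriminator field.
type NumberLiteral struct {
	Type  string `json:"type"`
	Value int    `json:"value"`
}

func (NumberLiteral) node() {}

// BinaryExpr holds its children as raw JSON so they can be decoded lazily.
type BinaryExpr struct {
	Type  string          `json:"type"`
	Op    string          `json:"op"`
	Left  json.RawMessage `json:"left"`
	Right json.RawMessage `json:"right"`
}

func (BinaryExpr) node() {}

// decodeNode peeks at the discriminator field, then unmarshals into the
// matching concrete type. This is the manual dispatch encoding/json forces
// on you when the target type is an interface.
func decodeNode(data []byte) (Node, error) {
	var probe struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(data, &probe); err != nil {
		return nil, err
	}
	switch probe.Type {
	case "number":
		var n NumberLiteral
		err := json.Unmarshal(data, &n)
		return n, err
	case "binary":
		var b BinaryExpr
		err := json.Unmarshal(data, &b)
		return b, err
	default:
		return nil, fmt.Errorf("unknown node type %q", probe.Type)
	}
}

func main() {
	raw := []byte(`{"type":"binary","op":"+","left":{"type":"number","value":1},"right":{"type":"number","value":2}}`)
	n, err := decodeNode(raw)
	if err != nil {
		panic(err)
	}
	expr := n.(BinaryExpr)
	left, _ := decodeNode(expr.Left) // nested nodes decode the same way
	fmt.Println(expr.Op, left.(NumberLiteral).Value)
}
```

It works, but every interface-typed field needs this kind of custom dispatch (or a custom `UnmarshalJSON`), which is exactly the boilerplate that made us decide the effort wasn’t worth it here.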
Actually, Set Homomorphic Hashing is Here to Stay!
As part of project “Basin”, we're moving forward with prototyping an incremental update propagation scheme based on homomorphic hashing. The gist is that homomorphic hashing allows one to update a hash value based on incremental changes to the underlying data, without needing to rehash everything. You can read more about this in last week’s post. But I also wanted to share something that Bruno Calza, Avichal Pandey, and I got excited about this week. Imagine this simple scenario:
You have some database state that you are updating locally, and you also want a remote service to replicate that state for you (i.e., an outsourced database service). Now also imagine that you trust the service enough to store your data with them, but you don’t want your downstream consumers (e.g., apps, users, etc.) to have to make the same trust assumptions.
This is a pretty common scenario in the web3 space. You might call it a reduced-trust system (rather than a trustless system). The general idea here is that the fact that I’m using a backup service or third-party storage solution is an implementation detail of my system that shouldn’t affect my users. So how can we set things up so that updates via my remote service are transparent and can be treated as if they came directly from me?
Often this is done using Merkle trees and Merkle inclusion proofs. But a Merkle tree with n leaves has O(log2 n)-sized proofs (and we need to maintain 2n - 1 hashes). So for lots of data (large trees), sending the proofs can dominate bandwidth consumption, whereas for small data it isn’t so bad. Things get worse when we want to test inclusion of multiple values, or when we want our Merkle structure to be updatable (which we do). For example, if we’re using a binary numeral tree to maintain a grow-only set of updates, a multi-range proof of k ranges evenly distributed throughout the tree will require about k * log2(n/k) hashes. So for a 100-element multi-proof over a database with, say, 100,000 elements, we’d generate a proof with just shy of 1000 elements. For a 32-byte hash, that’s about 30 KB. There are techniques and special cases where we can do a lot better (or a lot worse!). But what if there was a one-size-fits-all solution (i.e., fixed-size storage and proof costs)? This is a perfect job for homomorphic hashing! Here’s how it might work in one tiny Rust program with comments:
```rust
use lthash_rs::LtHash;
use sha3::{
    digest::{ExtendableOutput, Update, XofReader},
    Shake128,
};

fn main() {
    // We're using Shake128 as our base XOF hash function for this demo
    type LtHash16 = lthash_rs::LtHash16<Shake128>;
    // Start with an "empty" hash state
    let mut lthash = LtHash16::new();
    // I (data producer) update my database locally
    let elements = ["apple", "banana", "kiwi"];
    // And add the updates to the hash state
    lthash.insert(elements[0]);
    lthash.insert(elements[1]);
    lthash.insert(elements[2]);
    // I also send updates to the server (not shown)
    // But I do _not_ send any hashes to the server

    // Meanwhile, on the server... they aren't behaving
    let mut server = LtHash16::new();
    server.insert("cheese");

    // Now I want to make sure the state on the server is correct
    // This is a fixed size (and can include many elements)
    let mut element_one = LtHash16::new();
    element_one.insert(elements[1]);
    // I send this to the server and ask for a proof
    // On the server they compute the "difference"
    let proof = server.difference(&element_one);
    // At my end (the client) I compute the "union"
    let union = proof.union(&element_one);
    // Now I can check the proof, which will panic in this case
    assert_eq!(union.to_hex_string(), lthash.to_hex_string());
}
```
I’m leaving out some important details about signatures, posting hashes to a publicly accessible verification oracle, and things like that. But the TL;DR is we can compute arbitrarily large inclusion proofs with fixed size hashes and some really simple math! Magic 🧙.
GitHub Actions & Short-Lived Google Cloud Credentials
CI pipelines often interact with components from multiple cloud providers. Consider an integration test that runs on GitHub Actions infrastructure and needs to put a file in a Google Cloud Storage bucket. One way to authenticate your GitHub Actions job is to place your credentials file in the repository secrets.
That gives GitHub access to your Google Cloud bucket forever. However, giving a third-party infrastructure provider standing access to your resources is not good security practice: if your GitHub account gets compromised, the attacker can also access your Google Cloud bucket.
A better approach is to use the OpenID Connect (OIDC) protocol. OIDC allows your GitHub Actions workflows to access Google Cloud resources through a GCP IAM policy, without storing GCP credentials as long-lived GitHub secrets. The auth GitHub Actions module uses the workload identity federation offered by GCP to fetch a short-lived JWT token. This token gives temporary access to the Google Cloud resource you need for a particular workflow. To implement it in your GitHub Actions workflows, you can follow this guide.
Here is an example of a simple GitHub Actions workflow that authenticates on behalf of the developer and lists all GCP services.
```yaml
name: List services in GCP
on:
  pull_request:
    branches:
      - main

permissions:
  id-token: write # <= This is required for requesting the JWT

jobs:
  Get_OIDC_ID_token:
    runs-on: ubuntu-latest
    steps:
      - id: 'auth' # <= fetch OIDC token for this workflow
        name: 'Authenticate to GCP'
        uses: 'google-github-actions/auth@v1' # pin to the version you use
        with:
          create_credentials_file: 'true'
          workload_identity_provider: '<your gcp identity pool provider>'
          service_account: '<your gcp service account>'
      - id: 'gcloud'
        name: 'gcloud'
        run: |-
          gcloud auth login --brief --cred-file="${{ steps.auth.outputs.credentials_file_path }}"
          gcloud services list
```
Other Updates This Week
We’ll be launching our Studio application ahead of ETHOnline, so keep an eye out for access & announcement tweets!
We were at the FIL Dev Summit last week in Iceland (well, Andrew Hill was!).
We’re also excited to be at the IPFS Connect event, and we're looking for folks who want to present up-and-coming work or the next big thing during our lightning talk rounds. Reach out if this sounds interesting.
Our Mission Board has been activated, letting community members contribute and get rewarded Flight Time for their contributions.
We worked on crypto taxes & Filecoin reporting…thanks to Kruze for helping us through this!
End transmission…
Want to dive deeper, ask questions, or just nerd out with us? Jump into our weekly research office hours. Or, for hands-on technical support, you can join our weekly developer office hours.