Real URLs for AMP Cached Content Using Cloudflare Workers

Today, we’re excited to announce our solution for arguably the biggest issue affecting Accelerated Mobile Pages (AMP): the inability to use real origin URLs when serving AMP-cached content. To allow AMP caches to serve content under its origin URL, we implemented HTTP signed exchanges, which extend authenticity and integrity to content cached and served on behalf of a publisher. This logic lives on Cloudflare Workers, meaning that adding HTTP signed exchanges to your content is just a simple Workers application away. Publishers on Cloudflare can now take advantage of AMP performance and have AMP caches serve content with their origin URLs. We're thrilled to use Workers as a core component of this solution.

HTTP signed exchanges are a crucial component of the emerging Web Packaging standard, a set of protocols used to package websites for distribution through optimized delivery systems like Google AMP. This announcement comes just in time for Chrome Dev Summit 2018, where our colleague Rustam Lalkaka spoke about our efforts to advance the Web Packaging standard.

What is Web Packaging and Why Does it Matter?

You may already see the need for Web Packaging on a daily basis. On your smartphone, perhaps you’ve searched for Christmas greens, visited 1-800-Flowers directly from Google, and have been surprised to see content served under the URL https://google.com/amp/1800flowers.com/blog/flower-facts/types-of-christmas-greens/amp. This is an instance of AMP in action, where Google serves cached content so your desired web page loads faster.

Visiting 1-800 Flowers through AMP without HTTP signed exchange

Google cannot serve cached content under publisher URLs for clear security reasons. To securely present content from a URL, a TLS certificate for its domain is required. Google cannot provide 1-800-Flowers’ certificate on the vendor’s behalf, because it does not have the corresponding private key. Additionally, Google cannot, and should not be able to, sign content using the private key that corresponds to 1-800-Flowers’ certificate.

The inability to use original content URLs with AMP posed some serious issues. First, the google.com/amp URL prefix can strip URLs of their meaning. To the frustration of publishers, their content is no longer directly attributed to them by a URL (let alone a certificate). The publisher can no longer prove the integrity and authenticity of content served on their behalf.

Second, for web browsers the lack of a publisher’s URL can call the integrity and authenticity of a cached webpage into question. Namely, there’s no clear way to prove that this response is a cached version of an actual page published by 1-800-Flowers. Additionally, cookies are managed by third-party providers like Google instead of the publisher.

Enter Web Packaging, a collection of specifications for “packaging” website content with information like certificates and their validity. The HTTP signed exchanges specification allows third-party caches to cache and service HTTPS requests with proof of integrity and authenticity.

HTTP Signed Exchanges: Extending Trust with Cryptography

In the pre-AMP days, people expected to find a webpage’s content at one definitive URL. The publisher, who owns the domain of the definitive URL, would present a visitor with a certificate that corresponds to this domain and contains a public key.

The publisher would use the corresponding private key to sign a cryptographic handshake, which is used to derive shared symmetric keys that are used to encrypt the content and protect its integrity.

The visitor would then receive content encrypted and signed by the shared key.

The visitor’s browser then uses the shared key to verify the response’s signature and, in turn, the authenticity and integrity of the content received.

With services like AMP, however, online content may correspond to more than one URL. This introduces a problem: while only one domain actually corresponds to the webpage’s publisher, multiple domains can be responsible for serving a webpage. If a publisher allows AMP services to cache and serve their webpages, they must be able to sign their content even when AMP caches serve it for them. Only then can AMP-cached content prove its legitimacy.

HTTP signed exchanges directly address the problem of extending publisher signatures to services like AMP. This IETF draft specifies how publishers may sign an HTTP request/response pair (an exchange). With a signed exchange, the publisher can assure the integrity and authenticity of a response to a specific request even before the client makes the request. Given a signed exchange, the publisher authorizes intermediates (like Google’s AMP Cache) to forward the exchanges; the intermediate responds to a given request with the corresponding response in the signed HTTP request/response pair. A browser can then verify the exchange signature to assert the intermediate response’s integrity and authenticity.

This is like handing out an answer key to a quiz signed by the instructor. Having a signed answer sheet is just as good as getting the answer from the teacher in real time.

The Technical Details

An HTTP signed exchange is generated by the following steps.

First, the publisher uses MICE (Merkle Integrity Content Encoding) to provide a concise proof of integrity for the response included in the exchange. To start, the response is split into blocks of a fixed record size. Take, for example, a message ABCD, which is divided into record-size blocks A, B, C, and D. The first step to constructing a proof of integrity is to take the last block, D, and compute the following:

proof(D) = SHA-256(D || 0x0)

This produces proof(D). Then, all subsequent proof values for blocks are computed as follows:

proof(C) = SHA-256(C || proof(D) || 0x1)
proof(B) = SHA-256(B || proof(C) || 0x1)
proof(A) = SHA-256(A || proof(B) || 0x1)

The generation of these proofs builds the following tree:

      proof(A)
         /\
        /  \
       /    \
      A    proof(B)
            /\
           /  \
          /    \
         B    proof(C)
                /\
               /  \
              /    \
             C    proof(D)
                    |
                    |
                    D


As such, proof(A) is a 256-bit digest that a person who receives the real response should be able to recompute for themselves. If a recipient can recompute a tree head value identical to proof(A), they can verify the integrity of the response they received. In fact, this digest plays a similar role to the tree head of a Merkle Tree, which is recomputed and compared to the presented tree head to verify the membership of a particular node. The MICE-generated digest is stored in the Digest header of the response.
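The chained construction above can be sketched in a few lines of Rust. This is only an illustration of the formulas as written in this post, not the full MICE encoding: the function name `proof_head`, the pluggable `hash` parameter, and `toy_hash` are all assumptions made for the example. In particular, `toy_hash` is a stand-in built on the standard library's `DefaultHasher` so the sketch needs no external crates; it is not SHA-256 and offers no security.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Computes the head of the proof chain, working from the last block backwards.
/// `hash` is pluggable; a real implementation would pass SHA-256 here.
/// Panics if `record_size` is zero.
fn proof_head<H>(message: &[u8], record_size: usize, hash: H) -> Vec<u8>
where
    H: Fn(&[u8]) -> Vec<u8>,
{
    let mut blocks = message.chunks(record_size).rev();

    // proof(last) = H(last || 0x0)
    let last = blocks.next().unwrap_or_default();
    let mut input = last.to_vec();
    input.push(0x0);
    let mut proof = hash(&input);

    // proof(block) = H(block || proof(next block) || 0x1)
    for block in blocks {
        let mut input = block.to_vec();
        input.extend_from_slice(&proof);
        input.push(0x1);
        proof = hash(&input);
    }
    proof
}

/// Stand-in hash for demonstration only. NOT SHA-256 and NOT secure.
fn toy_hash(data: &[u8]) -> Vec<u8> {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish().to_be_bytes().to_vec()
}
```

Calling `proof_head(b"ABCD", 1, toy_hash)` walks the blocks D, C, B, A in that order. A recipient who runs the same computation over the bytes they received and gets a different head knows the response was altered.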

Next, the publisher serializes the headers and payloads of a request/response pair into CBOR (Concise Binary Object Representation). CBOR’s key-value storage is structurally similar to JSON, but creates smaller message sizes.

Finally, the publisher signs the CBOR-encoded request/response pair using the private key associated with the publisher’s certificate. This becomes the value of the sig parameter in the HTTP signed exchange.

The final HTTP signed exchange appears like the following:

sig=*MEUCIQDXlI2gN3RNBlgFiuRNFpZXcDIaUpX6HIEwcZEc0cZYLAIga9DsVOMM+g5YpwEBdGW3sS+bvnmAJJiSMwhuBdqp5UY=*;  
integrity="digest/mi-sha256";  
validity-url="https://example.com/resource.validity.1511128380";  
cert-url="https://example.com/oldcerts";  
cert-sha256=*W7uB969dFW3Mb5ZefPS9Tq5ZbH5iSmOILpjv2qEArmI=*;  
date=1511128380; expires=1511733180

Services like AMP can send signed exchanges by using a new HTTP response format that includes the signature above in addition to the original response.

When this signature is included in an AMP-cached response, a browser can verify the legitimacy of this response. First, the browser confirms that the certificate provided in cert-url corresponds to the request’s domain and is still valid. It next uses the certificate’s public key, as well as the headers and body values of request/response pair, to check the authenticity of the signature, sig. The browser then checks the integrity of the response using the given integrity algorithm, digest/mi-sha256 (aka MICE), and the contents of the Digest header. Now the browser can confirm that a response provided by a third party has the integrity and authenticity of the content’s original publisher.

After all this behind-the-scenes work, the browser can now present the original URL of the content instead of one prefixed by google.com/amp. Yippee to solving one of AMP’s most substantial pain points!

Generating HTTP Signed Exchanges with Workers

From the overview above, the process of generating an HTTP signed exchange is clearly involved. What if there were a way to automate the generation of HTTP signed exchanges and have services like AMP automatically pick them up? With Cloudflare Workers… we found a way you could have your HTTP origin exchange cake and eat it too!

We have already implemented HTTP signed exchanges for one of our customers, 1-800-Flowers. Code deployed in a Cloudflare Worker is responsible for fetching and generating information necessary to create this HTTP signed exchange.

This Worker works with Google AMP’s automatic caching. When Google’s search crawler crawls a site, it will ask for a signed exchange from the same URL if the site initially responds with Vary: AMP-Cache-Transform. Our HTTP signed exchange Worker checks if we can generate a signed exchange and if the current document is valid AMP. If it is, that Vary header is returned. After Google’s crawler sees this Vary response, it will send another request with the following two headers:

AMP-Cache-Transform: google
Accept: application/signed-exchange;v=b2

When our implementation sees these header values, it will attempt to generate and return an HTTP response with Content-Type: application/signed-exchange;v=b2.

Now that Google has cached this page with the signed exchange produced by our Worker, the requested page will appear with the publisher’s URL instead of Google’s AMP Cache URL. Success!

If you’d like to see HTTP signed exchanges in action on 1-800-Flowers, follow these steps:

  1. Install/open Chrome Beta for Android. (It should be version 71+).
  2. Go to goo.gl/webpackagedemo.
  3. Search for “Christmas greens.”
  4. Click on the 1-800-Flowers link -- it should be about 3 spots down with the AMP icon next to it. Along the way to getting there you should see a blue box that says "Results with the AMP icon use web packaging technology." If you see a different message, double check that you are using the correct Chrome Beta.
    An example of AMP in action for 1-800-Flowers:

Visiting 1-800 Flowers through AMP with HTTP signed exchange

The Future: Deploying HTTP Signed Exchanges as a Worker App

Phew. There’s clearly a lot of infrastructure for publishers to build for distributing AMP content. Thankfully Cloudflare has one of the largest networks in the world, and we now have the ability to execute JavaScript at the edge with Cloudflare Workers. We have developed a prototype Worker that generates these exchanges, on the fly, for any domain. If you’d like to start experimenting with signed exchanges, we’d love to talk!

Soon, we will release this as a Cloudflare Worker application to our AMP customers. We’re excited to bring a better AMP experience to internet users and advance the Web Packaging standard. Stay tuned!

The Big Picture

Web Packaging is not simply a technology that helps fix the URL for AMP pages; it’s a fundamental shift in the way that publishing works online. For the entire history of the web up until this point, publishers have relied on transport layer security (TLS) to ensure that the content that they send to readers is authentic. TLS is great for protecting communication from attackers but it does not provide any public verifiability. This means that if a website serves a specific piece of content to a specific user, that user has no way of proving that to the outside world. This is problematic when it comes to efforts to archive the web.

Services like the Internet Archive crawl websites and keep a copy of what the website returns, but who’s to say they haven’t modified it? And who’s to say that the site didn’t serve a different version of the site to the crawler than it did to a set of readers? Web Packaging fixes this issue by allowing sites to digitally sign the actual content, not just the cryptographic keys used to transport data. This subtle change enables a profoundly new ability that we never knew we needed: the ability to record and archive content on the Internet in a trustworthy way. This ability is something that is lacking in the field of online publishing. If Web Packaging takes off as a general technology, it could be the first step in creating a trusted digital record for future generations to look back on.

Excited about the future of Web Packaging and AMP? Check out Cloudflare Ampersand to see how we're implementing this future.

Indirect Detection

I'm like a prisoner in Plato's Cave, seeing only the shade you throw on the wall.
effingunicorns
2126 days ago
reply
this is how I find out about wank on tumblr--no wait, I'm sorry, "discourse".
Covarr
2126 days ago
reply
I had a friend a couple months ago ranting about how awful the pro-pedophilia movement was, and all I could think was "what pro-pedophilia movement?"
East Helena, MT
adial29
979 days ago
bizarre
chrisamico
2126 days ago
reply
This is pretty much the social web in 2018.
Boston, MA
rraszews
2126 days ago
reply
Fred Clark, the Slacktivist, has written a bunch of times before about the "Anti-Kitten-Burning Coalition". Long story short, probably no one is burning kittens or hunting shelter animals for sport; claiming such (and in many cases, convincing yourself you believe it too) is a way to make yourself feel like a hero for opposing something evil (Without having to do much work, since you can't actually go out there and fight the kitten-burners as said burners do not exist), and get other people to sign on to support your side because otherwise they're siding with the kitten-burners.
Columbia, MD
corjen
2126 days ago
reply
Sharing for the alt text.
Iowa
ireuben
2127 days ago
reply
I totally thought this was going to be a “my hobby is...” post (or maybe that’s just what the friend is doing!).

After NLL: Moving from borrowed data and the sentinel pattern

Continuing on with my “After NLL” series, I want to look at another common error that I see and its solution: today’s choice is about moves from borrowed data and the Sentinel Pattern that can be used to enable them.

The problem

Sometimes when we have &mut access to a struct, we have a need to temporarily take ownership of some of its fields. Usually what happens is that we want to move out from a field, construct something new using the old value, and then replace it. So for example imagine we have a type Chain, which implements a simple linked list:

enum Chain {
  Empty,
  Link(Box<Chain>),
}

impl Chain {
  fn with(next: Chain) -> Chain {
    Chain::Link(Box::new(next))
  }
}

Now suppose we have a struct MyStruct and we are trying to add a link to our chain; we might have something like:

struct MyStruct {
  counter: u32,
  chain: Chain,
}

impl MyStruct {
  fn add_link(&mut self) {
    self.chain = Chain::with(self.chain);
  }
}

Now, if we try to run this code, we will receive the following error:

error[E0507]: cannot move out of borrowed content
 --> ex1.rs:7:30
  |
7 |     self.chain = Chain::with(self.chain);
  |                              ^^^^ cannot move out of borrowed content

The problem here is that we need to take ownership of self.chain, but you can only take ownership of things that you own. In this case, we only have borrowed access to self, because add_link is declared as &mut self.

To put this as an analogy, it is as if you had borrowed a really nifty Lego building that your friend made so you could admire it. Then, later, you are building your own Lego thing and you realize you would like to take some of the pieces from their building and put them into yours. But you can’t do that – those pieces belong to your friend, not you, and that would leave a hole in their building.

Still, this is kind of annoying – after all, if we look at the larger context, although we are moving self.chain, we are going to replace it shortly thereafter. So maybe it’s more like – we want to take some blocks from our friend’s Lego building, but not to put them into our own building. Rather, we were going to take it apart, build up something new with a few extra blocks, and then put that new thing back in the same spot – so, by the time they see their building again, the “hole” will be all patched up.

Root of the problem: panics

You can imagine us doing a static analysis that permits you to take ownership of &mut borrowed data, as long as we can see that it will be replaced before the function returns. There is one little niggly problem though: can we be really sure that we are going to replace self.chain? It turns out that we can’t, because of the possibility of panics.

To see what I mean, let’s take that troublesome line and expand it out so we can see all the hidden steps. The original line was this:

self.chain = Chain::with(self.chain);

which we can expand to something like this:

let tmp0 = self.chain;        // 1. move `self.chain` out
let tmp1 = Chain::with(tmp0); // 2. build new link
self.chain = tmp1;            // 3. replace with `tmp1`

Written this way, we can see that in between moving self.chain out and replacing it, there is a function call: Chain::with. And of course it is possible for this function call to panic, at least in principle. If it were to panic, then the stack would start unwinding, and we would never get to step 3, where we assign self.chain again. This means that there might be a destructor somewhere along the way that goes to inspect self – if it were to try to access self.chain, it would just find uninitialized memory. Or, even worse, self might be located inside of some sort of Mutex or something else, so even if our thread panics, other threads might observe the hole.

To return to our Lego analogy, it is as if – after we removed some pieces from our friend’s Lego set – our parents came and made us go to bed before we were able to finish the replacement piece. Worse, our friend’s parents came over during the night to pick up the set, and so now when our friend gets it back, it has this big hole in it.

One solution: sentinel

In fact, there is a way to move out from an &mut pointer – you can use the function std::mem::replace. The replace function sidesteps the panic problem we just described because it requires you to already have a new value at hand, so that we can move out from self.chain and immediately put a replacement there.

Our problem here is that we need to do the move before we can construct the replacement we want. So, one solution then is that we can put some temporary, dummy value in that spot. I call this a sentinel value – because it’s some kind of special value. In this particular case, one easy way to get the code to compile would be to stuff in an empty chain temporarily:

let chain = std::mem::replace(&mut self.chain, Chain::Empty);
self.chain = Chain::with(chain);

Now the compiler is happy – after all, even if Chain::with panics, it’s not a memory safety problem. If anybody happens to inspect self.chain later, they won’t see uninitialized memory, they will see an empty chain.

To return to our Lego analogy, it’s as if, when we remove the pieces from our friend’s Lego set, we immediately stuff in a replacement piece. It’s an ugly piece, with the wrong color and everything, but it’s ok – because our friend will never see it.
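For reference, here is the sentinel version assembled into one piece that compiles on its own. The types come from the earlier snippets; the `len` helper and the derived `Debug`/`PartialEq` impls are additions for this sketch only, so the result can be inspected.

```rust
use std::mem;

#[derive(Debug, PartialEq)]
enum Chain {
    Empty,
    Link(Box<Chain>),
}

impl Chain {
    fn with(next: Chain) -> Chain {
        Chain::Link(Box::new(next))
    }

    // Hypothetical helper, not part of the post: counts the links in the chain.
    fn len(&self) -> usize {
        let mut links = 0;
        let mut cursor = self;
        while let Chain::Link(next) = cursor {
            links += 1;
            cursor = &**next;
        }
        links
    }
}

struct MyStruct {
    counter: u32,
    chain: Chain,
}

impl MyStruct {
    fn add_link(&mut self) {
        // Swap the sentinel in first, so `self.chain` always holds a valid value,
        // even if `Chain::with` were to panic.
        let chain = mem::replace(&mut self.chain, Chain::Empty);
        self.chain = Chain::with(chain);
    }
}
```

Starting from `MyStruct { counter: 0, chain: Chain::Empty }`, each call to `add_link` wraps the existing chain in one more `Link`.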

A more robust sentinel

The compiler is happy, but are we happy? Perhaps we are, but there is one niggling detail. We wanted this empty chain to be a kind of “temporary value” that nobody ever observes – but can we be sure of that? Actually, in this particular example, we can be fairly sure… other than the possibility of panic (which certainly remains, but is perhaps acceptable, since we are in the process of tearing things down), there isn’t really much else that can happen before self.chain is replaced.

But often we are in a situation where we need to take temporary ownership and then invoke other self methods. Now, perhaps we expect that these methods will never read from self.chain – in other words, we have a kind of interprocedural conflict. For example, maybe to construct the new chain we invoke self.extend_chain instead, which reads self.counter and creates that many new links in the chain:

impl MyStruct {
  fn add_link(&mut self) {
    let chain = std::mem::replace(&mut self.chain, Chain::Empty);
    let new_chain = self.extend_chain(chain);
    self.chain = new_chain;
  }
  
  fn extend_chain(&mut self, mut chain: Chain) -> Chain {
    for _ in 0 .. self.counter {
      chain = Chain::with(chain);
    }
    chain
  }
}

Now I would get a bit nervous. I think nobody ever observes this empty chain, but how can I be sure? At some point, you would like to test this hypothesis.

One solution here is to use a sentinel value that is otherwise invalid. For example, I could change my chain field to store an Option<Chain>, with the invariant that self.chain should always be Some, because if I ever observe a None, it means that add_link is in progress. In fact, there is a handy method on Option called take that makes this quite easy to do:

struct MyStruct {
  counter: u32,
  chain: Option<Chain>, // <-- new
}

impl MyStruct {
  fn add_link(&mut self) {
    // Equivalent to:
    // let link = std::mem::replace(&mut self.chain, None).unwrap();
    let link = self.chain.take().unwrap();
    self.chain = Some(Chain::with(link));
  }
}

Now, if I were to (for example) invoke add_link recursively, I would get a panic, so I would at least be alerted to the problem.

The annoying part about this pattern is that I have to “acknowledge” it every time I reference self.chain. In fact, we already saw that in the code above, since we had to wrap the new value with Some when assigning to self.chain. Similarly, to borrow the chain, we can’t just do &self.chain, but instead we have to do something like self.chain.as_ref().unwrap(), as in the example below, which counts the links in the chain:

impl MyStruct {
  fn count_chain(&self) -> usize {
    let mut links = 0;
    let mut cursor: &Chain = self.chain.as_ref().unwrap();
    loop {
      match cursor {
        Chain::Empty => return links,
        Chain::Link(c) => {
          links += 1;
          cursor = c;
        }
      }
    }
  }
}

So, the pro of using Option is that we get stronger error detection. The con is that we have an ergonomic penalty.
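To make the "stronger error detection" concrete, here is a small sketch that combines the Option sentinel with the extend_chain variant from earlier. The method name `add_links` and the panic messages are choices made for this example, not something from the snippets above.

```rust
enum Chain {
    Empty,
    Link(Box<Chain>),
}

impl Chain {
    fn with(next: Chain) -> Chain {
        Chain::Link(Box::new(next))
    }
}

struct MyStruct {
    counter: u32,
    // Invariant: `Some`, except while `add_links` is rebuilding the chain.
    chain: Option<Chain>,
}

impl MyStruct {
    // Hypothetical name: takes the chain, extends it, and puts it back.
    fn add_links(&mut self) {
        let chain = self.chain.take().expect("chain is already taken");
        let new_chain = self.extend_chain(chain);
        self.chain = Some(new_chain);
    }

    fn extend_chain(&mut self, mut chain: Chain) -> Chain {
        for _ in 0..self.counter {
            chain = Chain::with(chain);
        }
        chain
    }

    fn count_chain(&self) -> usize {
        let mut links = 0;
        // Panics if called while the chain is taken, which flags the bug loudly.
        let mut cursor: &Chain = self.chain.as_ref().expect("chain is taken");
        while let Chain::Link(next) = cursor {
            links += 1;
            cursor = &**next;
        }
        links
    }
}
```

If extend_chain were ever changed to call count_chain (or add_links) on self, the `expect` would fire immediately, whereas with a Chain::Empty sentinel the same mistake would silently report zero links.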

Observation: most collections do not allocate when empty

One important detail when mucking about with sentinels: creating an empty collection is generally “free” in Rust, at least for the standard library. This is important because I find that the fields I wish to move from are often collections of some kind or another. Indeed, even in our motivating example here, the Chain::Empty sentinel is an “empty” collection of sorts – but if the field you wish to move were e.g. a Vec<T> value, then you could as well use Vec::new() as a sentinel without having to worry about wasteful memory allocations.
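As a quick illustration with a Vec field, here is a sketch where the sentinel is simply `Vec::new()`. The `Inbox` type and its `shout` method are made up for this example.

```rust
use std::mem;

// Hypothetical type for illustration.
struct Inbox {
    messages: Vec<String>,
}

impl Inbox {
    // Rebuilds `messages` from the old values, which requires owning them.
    fn shout(&mut self) {
        // `Vec::new()` does not allocate, so the sentinel costs nothing.
        let taken = mem::replace(&mut self.messages, Vec::new());
        self.messages = taken.into_iter().map(|m| m.to_uppercase()).collect();
    }
}
```

The standard library documents that `Vec::new()` does not allocate until elements are pushed, and its capacity is zero for element types like `String`. On newer toolchains, `std::mem::take(&mut self.messages)` expresses the same swap for any type that implements `Default`.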

An alternative to sentinels: prevent unwinding through abort

There is a crate called take_mut on crates.io that offers a convenient alternative to installing a sentinel, although it does not apply in all scenarios. It also raises some interesting questions about “unsafe composability” that worry me a bit, which I’ll discuss at the end.

To use take_mut to solve this problem, we would rewrite our add_link function as follows:

fn add_link(&mut self) {
  take_mut::take(&mut self.chain, |chain| {
      Chain::with(chain)
  });
}

The take function works like so: first, it uses unsafe code to move the value from self.chain, leaving uninitialized memory in its place. Then, it gives this value to the closure, which in this case will execute Chain::with and return a new chain. This new chain is then installed to fill the hole that was left behind.

Of course, this begs the question: what happens if the Chain::with function panics? Since take has left a hole in the place of self.chain, it is in a tough spot: the answer from the take_mut library is that it will abort the entire process. That is, unlike with a panic, there is no controlled shutdown. There is some precedent for this: we do the same thing in the event of stack overflow, memory exhaustion, and a “double panic” (that is, a panic that occurs when unwinding another panic).
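To make the mechanism concrete, here is a minimal sketch of a function that behaves the way take is described above: move out with unsafe code, run the closure, and abort if it panics. The name `take_or_abort` is made up, and this is written from the description in this post rather than copied from the crate, so treat it as an approximation.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::{process, ptr};

// Hypothetical stand-in for `take_mut::take`, following the description above.
fn take_or_abort<T, F>(slot: &mut T, f: F)
where
    F: FnOnce(T) -> T,
{
    let p: *mut T = slot;
    unsafe {
        // Move the old value out, leaving `*slot` logically uninitialized.
        let old = ptr::read(p);
        // If the closure panics, nobody may observe the hole, so abort the process.
        let new = panic::catch_unwind(AssertUnwindSafe(|| f(old)))
            .unwrap_or_else(|_| process::abort());
        // Fill the hole without dropping the stale contents.
        ptr::write(p, new);
    }
}
```

For example, `take_or_abort(&mut s, |old| old + "c")` on a `String` consumes the old string by value and installs the concatenated result in its place.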

The idea of aborting the process is that, unlike unwinding, we are guaranteeing that there are no more possible observers for that hole in memory. Interestingly, in writing this article, I realized that aborting the process does not compose with some other unsafe abstractions you might want. Imagine, for example, that you had memory mapped a file on disk and were supplying an &mut reference into that file to safe code. Or, perhaps you were using shared memory between two processes, and had some kind of locked object in there – after locking, you might obtain an &mut into the memory of that object. Put another way, if the take_mut crate is safe, that means that an &mut can never point to memory not ultimately “owned” by the current process. I am not sure if that’s a good decision for us to make – though perhaps the real answer is that we need to permit unsafe crates to be a bit more declarative about the conditions they require from other crates, as I talk a bit about in this older blog post on observational equivalence.

My recommendation

I would advise you to use some variant of the sentinel pattern. I personally prefer to use a “signaling sentinel” like Option if it would be a bug for other code to read the field, unless the range of code where the value is taken is very simple. So, in our original example, where we just invoked Chain::with, I would not bother with an Option – we can locally see that self does not escape. But in the variant where we recursively invoke methods on self, I would, because there it would be possible to recursively invoke self.add_link or otherwise observe self.chain in this intermediate state.

It’s a bit annoying to use Option for this because it’s so explicit. I’ve sometimes created a Take<T> type that wraps a Option<T> and implements DerefMut<Target = T>, so it can transparently be used as a T in most scenarios – but which will panic if you attempt to deref the value while it is “taken”. This might be a nice library, if it doesn’t exist already.
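A rough sketch of what such a wrapper might look like follows. Only the name Take<T> and the Deref/DerefMut idea come from the paragraph above; the methods `new`, `take`, `put`, and `is_taken` are guesses at a reasonable API, not an existing library.

```rust
use std::ops::{Deref, DerefMut};

struct Take<T> {
    value: Option<T>,
}

impl<T> Take<T> {
    fn new(value: T) -> Self {
        Take { value: Some(value) }
    }

    // Moves the value out, leaving the wrapper in the "taken" state.
    fn take(&mut self) -> T {
        self.value.take().expect("value is already taken")
    }

    // Puts a value back, ending the "taken" state.
    fn put(&mut self, value: T) {
        self.value = Some(value);
    }

    fn is_taken(&self) -> bool {
        self.value.is_none()
    }
}

impl<T> Deref for Take<T> {
    type Target = T;

    // Panics if the value is currently taken.
    fn deref(&self) -> &T {
        self.value.as_ref().expect("value is taken")
    }
}

impl<T> DerefMut for Take<T> {
    // Panics if the value is currently taken.
    fn deref_mut(&mut self) -> &mut T {
        self.value.as_mut().expect("value is taken")
    }
}
```

With this, a field of type `Take<Vec<String>>` can be used like a plain `Vec<String>` for calls such as `len` or `push`, and only the code that actually moves the value out has to mention `take` and `put`.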

One other thing to remember: instead of using a sentinel, you may be able to avoid moving altogether, and sometimes that’s better. For example, if you have an &mut Vec<T> and you need ownership of the T values within, you can use the drain iterator method. The only real difference between drain and into_iter is that drain leaves an empty vector behind once iteration is complete.
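For instance, here is a small sketch of drain taking ownership of the elements through an &mut reference. The `Queue` type and `flush` method are invented for the example.

```rust
// Hypothetical type for illustration.
struct Queue {
    pending: Vec<String>,
}

impl Queue {
    // Consumes every pending job by value, without moving `self.pending` itself.
    fn flush(&mut self) -> Vec<String> {
        self.pending
            .drain(..)
            .map(|job| format!("done: {}", job))
            .collect()
    }
}
```

After `flush` returns, `self.pending` is still a valid, empty Vec, so no sentinel is needed.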

(Similarly, if you are writing an API and have the option of choosing between writing a fn(self) -> Self sort of signature vs fn(&mut self), you might adopt the latter, as it gives your callers more flexibility. But this is a bit subtle; it would make a good topic for the Rust API guidelines, but I didn’t find it there.)

Discussion

If you’d like to discuss something in this post, there is a dedicated thread on the users.rust-lang.org site.

Appendix A. Possible future directions

Besides creating a more ergonomic library to replace the use of Option as a sentinel, I can think of a few plausible extensions to the language that would alleviate this problem somewhat.

Tracking holes

The most obvious change is that we could plausibly extend the borrow checker to permit moves out of an &mut, so long as the value is guaranteed to be replaced before the function returns or panics. The “or panics” bit is the tricky part, of course.

Without any other extensions to the language, we would have to consider virtually every operation to “potentially panic”, which would be pretty limiting. Our “motivating example” from this post, for example, would fail the test, because the Chain::with function – like any function – might potentially panic. The main thing this would do is allow functions like std::mem::replace and std::mem::swap to be written in safe code, as well as other more complex rotations. Handy, but not earth shattering.

If we wanted to go beyond that, we would have to start looking into effect type systems, which allow us to annotate functions with things like “does not panic” and so forth. I am pretty nervous about taking that particular “step up” in complexity – though there may be other use cases (for example, to enable FFI interoperability with things that longjmp, we might want ways for functions to declare whether they panic and how anyway). But it feels like at best this will be a narrow tool that we wouldn’t expect people to use broadly.

In order to avoid annotation, @eddyb has tossed around the idea of an “auto trait”-style effect system. Basically, you would be able to state that you want to take as argument a “closure that can never call the function X” – in this case, that might mean “a closure that can never invoke panic!”. The compiler would then do a conservative analysis of the closure’s call graph to figure out if it works. This would then permit a variant of the take_mut crate where we don’t have to worry about aborting the process, because we know the closure never panics. Of course, just like auto traits, this raises semver concerns – sure, your function doesn’t panic now, but does that mean you promise never to make it panic in the future?

Permissions in, permissions out

There is another possible answer as well. We might generalize Rust’s borrowing system to express the idea of a “borrow that never ends” – presently that’s not something we can express. The idea would be that a function like add_link would take in an &mut but somehow express that, if a panic were to occur, the &mut is fully invalidated.

I’m not particularly hopeful on this as a solution to this particular problem. There is a lot of complexity to address and it just doesn’t seem even close to worth it.

There are however some other cases where similar sorts of “permission juggling” might be nice to express. For example, people sometimes want the ability to have a variant on insert – basically a function that inserts a T into a collection and then returns a shared reference &T to inserted data. The idea is that the caller can then go on to do other “shared” operations on the map (e.g., other map lookups). So the signature would look a little like this:

impl<T> SomeCollection<T> {
  fn insert_then_get(&mut self, data: T) -> &T {
    // insert `data`, then return a shared reference to it
  }
}

This signature is of course valid in Rust today, but it has an existing meaning that we can’t change. The meaning today is that the function requires unique access to self – and that unique access has to persist until we’ve finished using the return value. It’s precisely this interpretation that makes methods like Mutex::get_mut sound.
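A small sketch may help show what that existing meaning implies for callers. The `items` field, the `new` and `len` helpers, and the `demo` function are assumptions added for illustration; only `insert_then_get` comes from the signature above. The commented-out line is the kind of “other shared operation” that the borrow checker rejects today while the returned reference is still in use.

```rust
// Hypothetical Vec-backed collection, used only to illustrate the borrow.
struct SomeCollection<T> {
    items: Vec<T>,
}

impl<T> SomeCollection<T> {
    fn new() -> Self {
        SomeCollection { items: Vec::new() }
    }

    // Same shape as the signature discussed above: the returned `&T`
    // keeps the `&mut self` borrow alive for as long as it is used.
    fn insert_then_get(&mut self, data: T) -> &T {
        self.items.push(data);
        self.items.last().expect("an item was just pushed")
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}

fn demo() -> (i32, usize) {
    let mut c = SomeCollection::new();
    let value = {
        let r = c.insert_then_get(22);
        // Uncommenting the next line fails to compile today, because `c`
        // is still mutably borrowed while `r` is live:
        // let n = c.len();
        *r
    };
    // Once `r` is no longer used, shared access to `c` is allowed again.
    (value, c.len())
}
```

Under the “permissions in, permissions out” reading, the commented-out call would be accepted, since the unique access would be downgraded to shared access once the function returns.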

If we were to move in this direction, we might look to languages like Mezzo for inspiration, which encode this notion of “permissions in, permissions out” more directly7. I’m definitely interested in investigating this direction, particularly if we can use it to address other proposed “reference types” like &out (for taking references to uninitialized memory which you must initialize), &move, and so forth. But this seems like a massive research effort, so it’s hard to predict just what it would look like for Rust, and I don’t see us adopting this sort of thing in the near to mid term.

Panic = Abort having semantic impact

Shortly after I posted this, Gankro tweeted the following:

I actually meant to talk about that, so I’m adding this quick section. You may have noticed that panics and unwinding are a big thing in this post. Unwinding, however, is only optional in Rust – many users choose instead to convert panics into a hard abort of the entire process. Presently, the type and borrow checkers do not consider this option in any way, but you could imagine them taking it into account when deciding whether a particular bit of code is safe, particularly in lieu of a more fancy effect system.

I am not a big fan of this. For one thing, it seems like it would encourage people to opt into “panic = abort” just to avoid a sentinel value here and there, which would lead to more of a split in the ecosystem. But also, as I noted when discussing the take_mut crate, this whole approach presumes that an &mut reference can only ever refer to memory that is owned by the current process, and I’m not sure that’s something we wish to state.

Still, food for thought.

Footnotes

  1. I really like this Lego analogy. You’ll just have to bear with me.

  2. std::mem::replace is a super useful function in all kinds of scenarios; worth having in your toolbox.

  3. OK, maybe I’m taking this analogy too far. Sorry. I need help.

  4. I bet you were wondering what that counter field was for – gotta admire that Chekhov’s Gun action.

  5. i.e., some sort of sentinel where a panic occurs if the memory is observed

  6. It occurs to me that we now have a corpus of crates at various versions. It would be interesting to see how common it is to make something panic that did not use to, as well as to make other sorts of changes.

  7. Also related: fractional permissions and a whole host of other things.


Venus, Jupiter, and Noctilucent Clouds

Have you seen the passing planets yet?



Dogs in space

Confession: I once told my students something I knew wasn’t true. It was during a lecture on the Space Race, on Sputnik 2, which carried the dog Laika into space in November 1957. I told them about how the Soviets initially said she had lived a week before expiring (it was always intended to be a one-way trip), but that after the USSR had collapsed the Russians admitted that she had died almost immediately because their cooling systems had failed. All true so far.

But then one bright, sensitive sophomore, with a sheen on her eyes and a tremble in her voice, asked, “But did they at least learn something from her death?” And I said, “oh, um, well, uh… yes, yes — they learned a lot.”

Which I knew was false — they learned almost nothing. But what can you do, confronted with someone who is taking in the full reality of the fact that the Soviets sent a dog in space with the full knowledge it would die? It’s a heavy thing to admit that Laika gave her life in vain. (In subsequent classes, whenever I bring up Sputnik, I always preempt this situation by telling the above story, which relieves a little of the pressure.)

A Soviet matchbox with a heroic Laika, the first dog in space. Caption: “First satellite passenger — the dog, Laika.” Want it on a shirt, or a really wonderful mug?

I’m a dog person. I’ve had cats, but really, it’s dogs for me. I just believe that they connect with people on a deeper level than really any other animal. They’ve been bred to do just that, of course, and for a long time. There is evidence of human-dog cohabitation going back tens of thousands of years. (Cats were domesticated far more recently, and it shows.) There are many theories about the co-evolution of humans and dogs, and it has been said (in a generalization whose broadness I wince at, but whose message I endorse) that there have been many great civilizations without the wheel, but no great civilizations without the dog.

So I’ve always been kind of attracted to the idea of dogs in space. The “Mutniks,” as they were dubbed by punny American wags, were a key, distinguishing factor about the Soviet space program. And, Laika aside, a lot of them went up and came back down again, providing actually useful information about how organisms make do while in space, and allowing us to have more than just relentlessly sad stories about them. The kitsch factor is high, of course.

A friend of mine gave me a wonderfully quirky and beautiful little book last holiday season, Soviet Space Dogs, written by Olesya Turkina, published by FUEL Design and Publishing. According to its Amazon.com page, the idea for the book was hatched up by a co-founder of the press, who was apparently an aficionado of Mutnikiana (yes, I just invented that word). He collected a huge mass of odd Soviet (and some non-Soviet) pop culture references to the Soviet space dogs, and they commissioned Turkina, a Senior Research Fellow at the State Russian Museum, to write the text to accompany it. We had this book on our coffee table for several months before I decided to give it a spin, and I really enjoyed it — it’s much more than a lot of pretty pictures, though it is that, in spades, too. The narrative doesn’t completely cohere towards the end, and there are aspects of it that have a “translated from Russian” feel (and it was translated), but if you overlook those, it is both a beautiful and insightful book.

Soviet Space Dogs cover

First off, let’s start with the easy question: Why dogs? The American program primarily used apes and monkeys, as they were far better proxies for human physiology than even other mammals. Why didn’t the Soviets? According to one participant in the program, one of the leading scientists had looked into using monkeys, talking with a circus trainer, and found out that monkeys were terribly finicky: the training regimes were harder, they were prone to diseases, they were just harder in general to care for than dogs. “The Americans are welcome to their flying monkeys,” he supposedly said, “we’re more partial to dogs.” And, indeed, when they did use some monkeys later, they found that they were tough — one of them managed to worm his way out of his restraints and disable his telemetric equipment while in flight.

The Soviet dogs were all Moscow strays, picked for their size and their hardiness. The Soviet scientists reasoned that a dog that could survive on the streets was probably inherently tougher than purebred dogs that had only lived a domesticated life. (As the owner of a mutty little rescue dog, I of course am prone to see this as a logical conclusion.)

The Soviet dog program was more extensive than I had realized. Laika was the first in orbit, but she was not the first Soviet dog to be put onto a rocket. Turkina counts at least 29 dogs prior to Laika who were attached to R-1 and R-2 rockets (both direct descendants of the German V-2 rockets), sent up on flights hundreds of miles above the surface of the Earth starting in 1951. An appendix at the back of the book lists some of these dogs and their flights.

Oleg Gazenko, chief of the dog medical program, with Belka (right) and Strelka (left) at a TASS press conference in 1960. Gazenko called this “the proudest moment of his life.”

Many of them died. Turkina talks of the sorrow and guilt of their handlers, who (naturally) developed close bonds with the animals, and felt personally responsible when something went wrong. Some of the surviving dogs got to live with these handlers when they retired from space service. But when the surviving dogs eventually expired, they would sometimes end up stuffed and in a museum.

I had thought I had heard everything there was to hear about Laika, but I was surprised by how much I learned. Laika wasn’t really meant to be the first dog in space — she was the understudy of another dog who had gotten pregnant just before. Laika’s death was a direct result of political pressures to accelerate the launch before they were ready, in an effort to “Sputnik” the United States once again. The head of the dog medical program, when revealing Laika’s true fate in 2002, remarked that, “Working with animals is a source of suffering to all of us. We treat them like babies who cannot speak. The more time passes, the more I’m sorry about it. We shouldn’t have done it. We did not learn enough from the mission to justify the death of the dog.”

The Soviets did not initially focus on the identity of Laika. Laika was just listed as an experimental animal in the Sputnik 2 satellite. Rather, it was the Western press, specifically American and British journalists, that got interested in the identity, and fate, of the dog. The Soviet officials appear to have been caught by surprise; I can’t help but wonder if they’d had a little less secrecy, and maybe ran this by a few Americans, they’d have realized that of course the American public and press would end up focusing on the dog. It was only after discussion began in the West that Soviet press releases about Laika came out, giving her a name, a story, a narrative. And a fate: they talked about her as a martyr to science, who would be kept alive for a week before being painlessly euthanized.

Staged photo of Belka in a space suit.

In reality, Laika was already dead. They had, too late, realized that their cooling mechanisms were inadequate and she quickly, painfully expired. The fact that Laika was never meant to come back, Turkina argues, shaped the narrative: Laika had to be turned into a saintly hero, a noble and necessary sacrifice. One sees this very clearly in most of the Soviet depictions of Laika — proud, facing the stars, serious.

The next dogs, Belka and Strelka, came back down again. Belka was in fact an experienced veteran of other rocket flights. But it was Strelka’s first mission. Once again, Belka and Strelka were not meant to be the dogs for that mission: an earlier version of the rocket, kept secret at the time, exploded during launch a few weeks earlier, killing the dogs Lisichka and Chaika. These two dogs were apparently beloved by their handlers, and this was a tough blow. The secrecy of the program, of course, pervades the entire story of the Soviet side of the Space Race, and serves as a marked contrast with the much more public-facing US program (the consequences of which are explored in The Right Stuff, among other places).

When Belka and Strelka came back safely, Turkina argues, they became the first real Soviet “pop stars.” Soviet socialism didn’t really allow valorization of individual people other than Stakhanovite-style exhortations. The achievements of one were the achievements of all, which doesn’t really lend itself to pop culture. But dogs were fair game, which is one reason there is so much Soviet-era Mutnikiana to begin with: you could put Laika, Belka, and Strelka on cigarettes, matches, tea pots, commemorative plates, and so on, and nobody would complain. Plus, Belka and Strelka were cute. They could be trotted out at press conferences, on talk shows, and were the subjects of a million adorable pictures and drawings. When Strelka had puppies, they were cheered as evidence that biological reproduction could survive the rigors of space, and were both shown off and given as prized gifts to Soviet officials. So it’s not just that the Soviet space dogs are cool or cute — they’re also responsible for the development of a “safe” popular culture in a repressive society that didn’t really allow for accessible human heroes. Turkina also argues that Belka and Strelka in particular were seen as paradoxically “humanizing” space. By coming back alive, they fed dreams of an interstellar existence for mankind that were particularly powerful in the Soviet context.

Yuri Gagarin is reported to have joked: “Am I the first human in space, or the last dog?” It wasn’t such a stretch: the same satellite that Belka and Strelka rode in could be used for human beings, and gave them no more space. A friend of mine, Slava Gerovitch, has written a lot about the Soviet philosophy of space rocket design, and on the low regard the engineers who ran the program had for human passengers and their propensity for messing things up. Gagarin had about as much control over his satellite as Belka and Strelka did over theirs, because neither was trusted to actually fly a satellite. The contrast between the engineering attitudes of the Soviet Vostok and the American Mercury program is evident when you compare their instrument panels. The Mercury pilots were expected to be able to fly, while poor Gagarin was expected to be flown.

Soviet Space Dogs is a pretty interesting read. It’s a hard read for a dog lover. But seeing the Soviet space dogs in the context of the broader Soviet Space Race, and seeing them as more than just “biological cargo,” raises them from kitsch and trivia. There is also just something so emblematic of the space age about the idea of putting dogs into satellites — taking a literally pre-historic human technology, one of the earliest and most successful results of millennia of artificial breeding, and putting it atop a space-faring rocket, the most futuristic technology we had at the time.

3 public comments
skittone
3353 days ago
reply
Click through at the end to the instrument panel comparison, too.
JayM
3353 days ago
Nifty
robmessick
3359 days ago
reply
Was just telling someone about Laika.
San Francisco, CA
dbentley
3361 days ago
reply
Anyone still use NewsBlur? Great article.
nsanch
3360 days ago
like
joeyo
3359 days ago
Blur it up!
laza
3358 days ago
++ :)
Meghan8
3358 days ago
I use it every day as a reminder to put my glasses on!
skittone
3353 days ago
Weird question. Even if you stop doing something, other people often keep right on doing it.

The latest best image of Pluto and Charon

Raw images of Pluto document our progress to the dwarf planet! We are about 15 days away from the close encounter with Pluto. Raw images are being uploaded here, every day. Other information and goodies can be found here.