Rust: What the fuck is a borrow checker and why is it yelling at me?

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Rust: What the fuck is a borrow checker and why is it yelling at me?

Urit: Oct 22, 2010

So I'm experimenting with Rust after using Go for a while and I'm having trouble with the classic threading model (I really like Go's process model/goroutine stuff but I want generics and immutability). Are there any solid libraries that let me do CSP-style stuff/selects over channels etc. easily?
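For the channel half of that question, a minimal sketch using only the standard library's std::sync::mpsc, with OS threads standing in for goroutines. The function name fan_in and the squaring "work" are made up for illustration. As far as I know std has no stable select over several channels, so that part needs a third-party crate (crossbeam-channel's select! macro is one option I'm aware of).

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: fan results from several worker threads into one channel.
pub fn fan_in(inputs: Vec<u32>) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();

    for n in inputs {
        // Each worker gets its own clone of the sending half.
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // A send error would mean the receiver was already dropped.
            tx.send(n * n).expect("receiver dropped");
        }));
    }

    // Drop the original sender so the receive loop ends once every worker is done.
    drop(tx);

    let mut out: Vec<u32> = rx.iter().collect();
    for h in handles {
        h.join().unwrap();
    }

    // Arrival order depends on scheduling, so sort for a stable result.
    out.sort();
    out
}
```

Calling fan_in(vec![1, 2, 3]) spawns three threads and collects their squares once all senders have gone away.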

Mar 6, 2016 23:32

Urit: Oct 22, 2010

So I'm porting some stuff from Go to Rust to see what it's like, and I'm trying to figure out what the Rust version of this code is. My Go code looks like this:

code:

type Decoder interface {
	Decode(data []byte) (value message.Message, err error)
}

type bindFunc func(c *toml.TomlTree) (Decoder, error)

type registry struct {
	l        sync.RWMutex
	decoders map[string]Decoder
	defaults map[string]Decoder
	binders  map[string]bindFunc
}

// Global static variable
var r = registry{
	decoders: make(map[string]Decoder),
	defaults: make(map[string]Decoder),
	binders:  make(map[string]bindFunc),
}

// One of the functions to hook up a config builder method
func Register(id string, f bindFunc) {
	r.l.Lock()
	r.binders[id] = f
	r.l.Unlock()
}

with a few methods to lock, write to the hashmaps, then unlock, as well as read from them the same way.

I'm trying to figure out how to do this sort of thing in Rust. So far it's been a billion errors about stuff not implementing Sized - apparently that's because a trait by itself isn't assumed to be a pointer to a struct that implements it, the way an interface is in Go, so I have to Box<> them. After I figured that out, it turned out you can't make a global variable just by wrapping it in a Mutex, so I need the lazy_static crate. That still didn't work, because a static has to be Sync, a Mutex is only Sync if its contents are Send, and a plain Box<Decoder> isn't Send, so I needed to make my Boxes Box<Decoder+Send>. I'm trying to figure out why this is such a mess and what I'm doing that's so very wrong.

What I'm actually trying to do is make a bunch of struct types that implement a trait called Decoder (e.g. XMLDecoder, JSONDecoder), then dump the output from a TOML parser into a method that configures each one and returns a configured struct instance. Each instance gets registered under a string name for later reference, so I can pass a slice of bytes into its decode method from a bunch of different "reader" type things (e.g. I query a bunch of APIs and pass each result through a different decoder depending on some configuration on the reader side). The decode method never mutates its parent object, it just reads configuration from it, so it should be thread-safe to call decode against the same decoder from a whole bunch of threads at once. The end result is that it spits out a JSON-esque enum object (the Message).

I was initially attracted to Rust because of the Enum concept and not having to use interface{} for everything and playing with reflection to see what types are inside the interface{}, but it definitely has a learning curve.

So far what I've got is this:

code:

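#[macro_use]
extern crate lazy_static;
extern crate toml;

use message::Message;
use std::collections::HashMap;
use std::error::Error;
use std::sync::Mutex;

trait Decoder {
    // Takes &self, so decoding never mutates the decoder's configuration.
    fn decode(&self, data: Vec<u8>) -> Result<Message, Box<Error>>;
}

// A binder builds a configured decoder from a parsed TOML value. It returns
// the same Box<Decoder + Send> that the registry maps store.
type BindFunc = fn(toml::Value) -> Result<Box<Decoder + Send>, Box<Error>>;

struct Registry {
    // One Mutex per map; the Send bound is what lets each Mutex be Sync.
    decoders: Mutex<HashMap<String, Box<Decoder + Send>>>,
    defaults: Mutex<HashMap<String, Box<Decoder + Send>>>,
    binders: Mutex<HashMap<String, BindFunc>>,
}

impl Registry {
    fn new() -> Registry {
        Registry {
            decoders: Mutex::new(HashMap::new()),
            defaults: Mutex::new(HashMap::new()),
            binders: Mutex::new(HashMap::new()),
        }
    }

    // Registers a binder under `name`, replacing any earlier one.
    pub fn register(&self, name: String, b: BindFunc) {
        let mut bs = self.binders.lock().unwrap();
        bs.insert(name, b);
    }
}

lazy_static! {
    // Built on first access, since a plain static can't call Registry::new().
    static ref REGISTRY: Registry = Registry::new();
}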
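For comparison, on newer toolchains the same kind of lazily built global can be sketched with only the standard library. This assumes Rust 1.70 or later for std::sync::OnceLock and uses the current dyn spelling for trait objects. The Message enum, Utf8Decoder, the String error type and the helper names are made-up stand-ins, not the real types from the post.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// Hypothetical stand-in for the real message type.
#[derive(Debug, PartialEq)]
pub enum Message {
    Text(String),
}

pub trait Decoder {
    fn decode(&self, data: &[u8]) -> Result<Message, String>;
}

// Hypothetical decoder used only for illustration.
pub struct Utf8Decoder;

impl Decoder for Utf8Decoder {
    fn decode(&self, data: &[u8]) -> Result<Message, String> {
        String::from_utf8(data.to_vec())
            .map(Message::Text)
            .map_err(|e| e.to_string())
    }
}

type DecoderMap = HashMap<String, Box<dyn Decoder + Send>>;

// Lazily initialized global. The Mutex is Sync because the boxed decoders are Send.
fn registry() -> &'static Mutex<DecoderMap> {
    static REGISTRY: OnceLock<Mutex<DecoderMap>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

// Stores a decoder under `name`, replacing any earlier one.
pub fn register(name: &str, d: Box<dyn Decoder + Send>) {
    registry().lock().unwrap().insert(name.to_string(), d);
}

// Decodes while holding the lock, so no reference escapes the guard.
pub fn decode_with(name: &str, data: &[u8]) -> Option<Result<Message, String>> {
    let map = registry().lock().unwrap();
    map.get(name).map(|d| d.decode(data))
}
```

Decoding under the lock keeps the borrow simple, at the cost of serializing every decode call through one Mutex.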

Urit fucked around with this message at 04:00 on Apr 24, 2016

Apr 24, 2016 03:52

Urit: Oct 22, 2010

Ethereal posted:

I guess my first question is why do you want global state? Can you not get away with passing things around as needed?

I don't NEED it to be global (I can just create 1 registry instance and then pass it around somehow I suppose), I just need a single "registry" that the configured structs can get loaded into so I can reference the configured struct from configuration later. I thought about it some more and holy poo poo this is way different not having a garbage collector letting me poo poo objects everywhere.

In the Go code all the stuff registers itself on init e.g.

code:

func init() {
	RegisterDefault("json", &JsonDecoder{})
}

which is similar to a static constructor in C# or a static initializer in C++ - it's executed once when the package is initialized, before main() runs (which is program load in this case).

The config looks something like:

code:

[source.foo]
type = "file"
path = "/x/bar.txt"
decoder = "json"

[decoder.json]
some_option = true

As I said before, regardless of the globalness of the state, what I'm trying to do is register each type of source, decoder, etc. at program load (there are really 6 different traits - source, sink, transformer, decoder, splitter, encoder) with some sort of "registry" struct that maps a string name to a pointer to a struct that implements a trait. That way, when I go to configure the "foo" source, I can look up the "json" decoder (which is just a struct that implements the Decoder trait, meaning it has a "decode" method that takes a byte array and turns it into an object) and hook that "decode" method into the source, so that every time the source grabs a chunk of data it can call decode on it.
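A rough sketch of that name-to-constructor registry without any global state, in current Rust syntax. Everything here is an assumption for illustration: Config is a plain string map standing in for a parsed TOML table, LineDecoder and its trim option are invented, and errors are plain Strings.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a parsed TOML table: option name -> raw value.
pub type Config = HashMap<String, String>;

pub trait Decoder {
    fn decode(&self, data: &[u8]) -> Result<String, String>;
}

// Hypothetical decoder with a single configurable option.
pub struct LineDecoder {
    trim: bool,
}

impl Decoder for LineDecoder {
    fn decode(&self, data: &[u8]) -> Result<String, String> {
        let s = std::str::from_utf8(data).map_err(|e| e.to_string())?;
        Ok(if self.trim { s.trim().to_string() } else { s.to_string() })
    }
}

// Plays the role of the Go bindFunc: config in, configured decoder out.
pub type BindFunc = fn(&Config) -> Result<Box<dyn Decoder>, String>;

fn bind_line(c: &Config) -> Result<Box<dyn Decoder>, String> {
    let trim = match c.get("trim").map(String::as_str) {
        None | Some("false") => false,
        Some("true") => true,
        Some(other) => return Err(format!("bad value for trim: {}", other)),
    };
    Ok(Box::new(LineDecoder { trim }))
}

#[derive(Default)]
pub struct Registry {
    binders: HashMap<String, BindFunc>,
    decoders: HashMap<String, Box<dyn Decoder>>,
}

impl Registry {
    // The one place that knows which decoder types exist.
    pub fn with_builtin_binders() -> Self {
        let mut r = Registry::default();
        r.binders.insert("line".to_string(), bind_line as BindFunc);
        r
    }

    // Builds a decoder of type `kind` from its config and stores it under that name.
    pub fn configure(&mut self, kind: &str, c: &Config) -> Result<(), String> {
        let bind = *self
            .binders
            .get(kind)
            .ok_or_else(|| format!("unknown decoder type: {}", kind))?;
        let decoder = bind(c)?;
        self.decoders.insert(kind.to_string(), decoder);
        Ok(())
    }

    // What a source would call to look up its decoder by name.
    pub fn get(&self, name: &str) -> Option<&dyn Decoder> {
        self.decoders.get(name).map(|d| d.as_ref())
    }
}
```

The registry is filled once at startup and then only read, which is why plain HashMaps without a Mutex are enough in this sketch.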

Also I'm running into lifetime errors trying to get a value out of a map and pass it back to a caller. I guess it makes sense because the map is holding onto that value and the value could be deleted, so then the caller would be referencing freed memory. I guess I have to wrap the whole thing in an Arc<Decoder> instead and clone it for every consumer of the decoder.
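A minimal sketch of that Arc idea, assuming the decoders sit behind a Mutex. The getter hands out a cloned Arc (an owned handle), so nothing borrowed from the map has to outlive the lock guard. UpperDecoder, decode_on_threads and the String result types are hypothetical, and the trait object is written with today's dyn keyword.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Send + Sync supertraits let an Arc<dyn Decoder> cross thread boundaries.
pub trait Decoder: Send + Sync {
    fn decode(&self, data: &[u8]) -> Result<String, String>;
}

// Hypothetical decoder used only for illustration.
pub struct UpperDecoder;

impl Decoder for UpperDecoder {
    fn decode(&self, data: &[u8]) -> Result<String, String> {
        String::from_utf8(data.to_vec())
            .map(|s| s.to_uppercase())
            .map_err(|e| e.to_string())
    }
}

pub struct Registry {
    decoders: Mutex<HashMap<String, Arc<dyn Decoder>>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { decoders: Mutex::new(HashMap::new()) }
    }

    pub fn register(&self, name: &str, d: Arc<dyn Decoder>) {
        self.decoders.lock().unwrap().insert(name.to_string(), d);
    }

    // Cloning the Arc only bumps a reference count; the decoder itself is shared.
    pub fn get(&self, name: &str) -> Option<Arc<dyn Decoder>> {
        let map = self.decoders.lock().unwrap();
        map.get(name).cloned()
    }
}

// Each thread gets its own Arc handle to the same decoder.
pub fn decode_on_threads(
    reg: &Registry,
    name: &str,
    inputs: Vec<Vec<u8>>,
) -> Vec<Result<String, String>> {
    let handles: Vec<_> = inputs
        .into_iter()
        .filter_map(|input| {
            let d = reg.get(name)?;
            Some(thread::spawn(move || d.decode(&input)))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Removing an entry from the map later is safe with this layout, because any consumer still holding an Arc keeps the decoder alive.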

Urit fucked around with this message at 08:19 on Apr 24, 2016

Apr 24, 2016 07:08

Urit: Oct 22, 2010

syncathetic posted:

Could you explain what BindFunc and Registry::binders are?

BindFunc takes a TOML parse tree and turns it into a configured struct. It's basically a constructor - I just called it a bindfunc because Go doesn't have OO-style class-based constructors and you can't easily scope a function to a type's namespace the way Rust does with <whatever>::new() inside an impl block. I was "binding" config values to struct fields. A binder for a UDP listener looks like:

code:

func bindUDPListen(c *toml.TomlTree) (Source, error) {
	var errs *multierror.Error

	s := &udpListen{}

	if val, err := util.BindToString(c, "address", true); err == nil {
		if addr, uerr := net.ResolveUDPAddr("udp", val); uerr == nil {
			s.uaddr = addr
		} else {
			errs = multierror.Append(errs, uerr)
		}
	} else {
		errs = multierror.Append(errs, err)
	}

	s.buffer = make([]byte, maxUDPPacketSize)

	if errs.ErrorOrNil() != nil {
		return nil, errs
	}
	return s, nil
}

and then I register it with the registry as {"udplistener": func pointer to bindUDPListen} so if I need to construct a source of type "udplistener" I know which function to call to do that.

Now, your code: Thanks so much, that's very similar to what I'm trying to do, but what the heck is decoders_cache, and why is it borrowing a deref (&*) in a map call? I'm confused as to why I can't just return the reference directly from the map.get() call. Also, why do the hashmaps map to a "usize" instead of a Box<Decoder>, and how would I insert a decoder into them? The thing is that each decoder itself is responsible for calling Register, though again, I'm not sure how I'd do that in Rust, because as far as I can tell you can't call functions in an "init" or global context - maybe via std::sync::Once? I'd have to call a constructor on each decoder and add it to the map, correct?
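On the borrow-of-a-deref question, here is a small sketch of what that pattern generally does. It is not the code from the quoted post, and JsonDecoder, name() and this Registry are invented for illustration. HashMap::get on a map of boxes returns a reference to the Box, and dereferencing then reborrowing turns that into a reference to the decoder itself.

```rust
use std::collections::HashMap;

pub trait Decoder {
    fn name(&self) -> &'static str;
}

// Hypothetical decoder used only for illustration.
pub struct JsonDecoder;

impl Decoder for JsonDecoder {
    fn name(&self) -> &'static str {
        "json"
    }
}

pub struct Registry {
    decoders: HashMap<String, Box<dyn Decoder>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { decoders: HashMap::new() }
    }

    pub fn insert(&mut self, key: &str, d: Box<dyn Decoder>) {
        self.decoders.insert(key.to_string(), d);
    }

    // Returning get() directly works, but the caller sees a reference to the Box.
    pub fn get_boxed(&self, key: &str) -> Option<&Box<dyn Decoder>> {
        self.decoders.get(key)
    }

    // `&**b`: the first `*` goes from &Box<dyn Decoder> to the Box, the second
    // from the Box to the dyn Decoder inside it, and `&` reborrows that value.
    pub fn get(&self, key: &str) -> Option<&dyn Decoder> {
        self.decoders.get(key).map(|b| &**b)
    }
}
```

Both getters borrow from the registry, so the returned reference is only valid while the registry is alive and not being mutated.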

Maybe I'm just doing this hilariously wrong - given the problem, is there a better way? The problem is: take configuration and build structs from that configuration, then allow a configured struct to reference another configured struct. Assume that ordering of config is not an issue e.g. if struct type A depends on struct type B then all structs of type B will always be configured first. This is basically dependency injection, I think.

Edit:

Got it working: https://gist.github.com/highlyunavailable/f8424d2881e2d7b2d510d114a57ed9c3

It still feels like I'm doing it wrong somehow.

Urit fucked around with this message at 21:52 on Apr 24, 2016

Apr 24, 2016 19:03

Urit: Oct 22, 2010

Double-posting time! I think I have a better (read: more idiomatic) version now:

https://gist.github.com/highlyunavailable/0dab6e17bbace8fd10fa7c2e2f121d27

It seems like lifetimes are doing what I want - each item in the registry must last as long as the registry itself (there is no possibility of deleting it), so as long as the registry is in scope, I can guarantee that the items will not be deallocated. I'm not sure how this will interact with threads but I will only be filling the registry once in a single thread and the registries themselves will live in the "main" thread (the config will be read single-threaded) and then hopefully I can pass off an immutable reference to the decoder (via get() or default()) to another thread.
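One way the "hand an immutable reference to another thread" part could look on a recent toolchain, as a sketch only. It assumes Rust 1.63 or later for std::thread::scope, and LenDecoder, decode_all and this cut-down Registry are hypothetical, not what is in the gist.

```rust
use std::collections::HashMap;
use std::thread;

// The Sync supertrait is what allows a shared &dyn Decoder to be used from several threads.
pub trait Decoder: Sync {
    fn decode(&self, data: &[u8]) -> usize;
}

// Hypothetical decoder: "decodes" a chunk by counting its bytes.
pub struct LenDecoder;

impl Decoder for LenDecoder {
    fn decode(&self, data: &[u8]) -> usize {
        data.len()
    }
}

pub struct Registry {
    decoders: HashMap<String, Box<dyn Decoder>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { decoders: HashMap::new() }
    }

    pub fn insert(&mut self, name: &str, d: Box<dyn Decoder>) {
        self.decoders.insert(name.to_string(), d);
    }

    // The returned reference is tied to the registry's lifetime.
    pub fn get(&self, name: &str) -> Option<&dyn Decoder> {
        self.decoders.get(name).map(|d| d.as_ref())
    }
}

// Decodes every chunk on its own thread, all sharing one borrowed decoder.
pub fn decode_all(reg: &Registry, name: &str, chunks: &[Vec<u8>]) -> Option<Vec<usize>> {
    let decoder = reg.get(name)?;
    // thread::scope joins every spawned thread before it returns, so the
    // borrowed decoder cannot outlive the registry it came from.
    Some(thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|c| s.spawn(move || decoder.decode(c)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<Vec<usize>>()
    }))
}
```

Plain thread::spawn would reject the borrowed decoder because it requires 'static data; scoped threads are the piece that lets the registry's lifetime do the work.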

I'm still not sure how to actually do the registration but worst case I just have a big old populate function in each module that I manually add each submodule to, or an init function in each submodule that the populate function calls.
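The "big old populate function" idea, sketched with made-up names. In a real crate init_json and init_xml would live in their own submodules; they are flattened into one file here to keep the example self-contained, and kind() is just a placeholder method.

```rust
use std::collections::HashMap;

pub trait Decoder {
    fn kind(&self) -> &'static str;
}

pub struct Registry {
    defaults: HashMap<String, Box<dyn Decoder>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { defaults: HashMap::new() }
    }

    pub fn register_default(&mut self, name: &str, d: Box<dyn Decoder>) {
        self.defaults.insert(name.to_string(), d);
    }

    // Registered names in sorted order, handy for checking what got loaded.
    pub fn names(&self) -> Vec<String> {
        let mut names: Vec<String> = self.defaults.keys().cloned().collect();
        names.sort();
        names
    }
}

// Hypothetical decoders; each would normally sit in its own submodule.
pub struct JsonDecoder;
pub struct XmlDecoder;

impl Decoder for JsonDecoder {
    fn kind(&self) -> &'static str {
        "json"
    }
}

impl Decoder for XmlDecoder {
    fn kind(&self) -> &'static str {
        "xml"
    }
}

// Per-"module" init functions, the manual counterpart of Go's init().
pub fn init_json(r: &mut Registry) {
    r.register_default("json", Box::new(JsonDecoder));
}

pub fn init_xml(r: &mut Registry) {
    r.register_default("xml", Box::new(XmlDecoder));
}

// The single place that lists every decoder; call it once at startup.
pub fn populate(r: &mut Registry) {
    init_json(r);
    init_xml(r);
}
```

The cost is that adding a decoder means touching populate as well, but the registration order is explicit and nothing runs before main.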

Apr 25, 2016 19:14

Urit: Oct 22, 2010

Jo posted:

I'm sorta' stuck on generics and inheritance now. I'm defining `trait Node` which has a few attributes. I'd like to make a Graph struct which has a HashMap <String, Node> in it. Is there any way to make graph accept any mix of types, so long as they implement Node? Box them?

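Yup, you have to box them. A trait isn't a concrete type, and the different structs that implement Node can all be different sizes, so the compiler has no single size to use for a bare Node value and can't store one inline in the HashMap. A Box<Node> is a fixed-size pointer to the value on the heap (plus a vtable pointer so it knows which implementation to call), so that's what goes in the map.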
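A minimal sketch of that Graph, using today's dyn spelling for the trait object. City, Junction and the label() method are invented stand-ins for whatever the real Node trait exposes.

```rust
use std::collections::HashMap;

pub trait Node {
    fn label(&self) -> String;
}

// Two hypothetical node types of different sizes.
pub struct City {
    pub name: String,
}

pub struct Junction {
    pub id: u32,
}

impl Node for City {
    fn label(&self) -> String {
        format!("city:{}", self.name)
    }
}

impl Node for Junction {
    fn label(&self) -> String {
        format!("junction:{}", self.id)
    }
}

pub struct Graph {
    // Every value is a same-sized Box, whatever concrete type it points to.
    nodes: HashMap<String, Box<dyn Node>>,
}

impl Graph {
    pub fn new() -> Self {
        Graph { nodes: HashMap::new() }
    }

    pub fn add(&mut self, key: &str, node: Box<dyn Node>) {
        self.nodes.insert(key.to_string(), node);
    }

    // Calls through the vtable, so the right label() runs for each type.
    pub fn label_of(&self, key: &str) -> Option<String> {
        self.nodes.get(key).map(|n| n.label())
    }
}
```

A City and a Junction can then sit side by side in the same map, for example by calling add with Box::new(City { .. }) and Box::new(Junction { .. }).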

Jun 2, 2016 02:04
