New post about fixing Gitea on Void
Meta(
    title: "Manual intervention required when updating Gitea to v1.20.0 on Void",
    summary: None,
    published: Some(Time(
        year: 2023,
        month: 8,
        day: 1,
        hour: 4,
        minute: 37,
        second: 23,
    )),
    tags: [
        "software",
        "gitea",
    ],
)
---

My local Gitea instance runs on a Raspberry Pi 4 running Void Linux. I'm quite happy with Void itself. It's been totally rock solid and reliable since I got all of the kinks worked out about a week after the initial install. Gitea, however, has been a bit of a PITA on a few occasions prior to moving to Void, and the latest package upgrade I ran managed to break a few things. It's all back up and running now, but not without some frustration.

After some troubleshooting, I found an obsolete setting in the config file which, once removed, at least allowed the daemon to start. However, connecting to the machine with the `git` user failed with a nice cryptic error message:

```
PTY allocation request failed on channel 0
2023/07/31 23:53:42 ...s/setting/setting.go:109:LoadCommonSettings() [F] Unable to load settings from config: unable to create chunked upload directory: /usr/bin/data/tmp/package-upload (mkdir /usr/bin/data: permission denied)
```

As it turns out, Gitea cannot run without certain files and directories existing, which it attempts to create on launch if they aren't there. They're supposed to be created in the home directory of the user running Gitea, in this case `git`, but if Gitea is run without `$HOME` set then it will look for those paths relative to the binary, which is definitely not the desired behavior. More accurately, it needs the variable `GITEA_WORK_DIR`, but it derives that value from `$HOME`.

What does this have to do with ssh? Gitea attempts to automatically manage the `.ssh` directory of its user, and sets itself as a custom command to handle keys in `.ssh/authorized_keys`.

```
command="/usr/bin/gitea --config=/etc/gitea/app.ini serv key-2",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,no-user-rc,restrict ssh-ed25519 <redacted>
```

Now, since we've established that Gitea won't start without `$GITEA_WORK_DIR` being set, ssh can't authorize the `git` user using pubkey authentication because the handler command chokes, even though Gitea is running away happily as a daemon, having gotten the vars set correctly in the `run` script which runit uses to start it up. I'm not thrilled with this design, but there's the bug. And the fix is to change `.ssh/authorized_keys` so that the handler is run with the environment variable set:

```
command="env GITEA_WORK_DIR=/var/lib/gitea /usr/bin/gitea --config=/etc/gitea/app.ini serv key-2",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,no-user-rc,restrict ssh-ed25519 <redacted>
```

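The daemon itself only dodges this bug because runit hands it the right environment. A `run` script along these lines is what sets those vars for the service; this is a sketch using the paths from this post, not the exact script shipped by the Void package:

```
#!/bin/sh
# Sketch of a runit run script for Gitea (e.g. /etc/sv/gitea/run).
# Illustrative, not copied verbatim from the Void package: the point is
# that HOME and GITEA_WORK_DIR are exported here, so the daemon starts fine.
export HOME=/var/lib/gitea
export GITEA_WORK_DIR=/var/lib/gitea
exec chpst -u git:git /usr/bin/gitea --config=/etc/gitea/app.ini web 2>&1
```

The ssh handler in `authorized_keys` never goes through this script, which is why it needs the `env` prefix added by hand.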
Here's hoping that an upcoming release makes this all less brittle. I don't like that it overwrites `.ssh/authorized_keys`. I get the reason, but messing with configs is why I hate "user friendly" distros and prefer to work with FreeBSD or Linux distros which follow the KISS principle, like Void. Once I've written a config, I expect it to never change unless I edit it myself. As for getting configuration from environment vars, that's kind of stupid and redundant when your application actually has a config file.

## C

I haven't written C in a while, and I wasn't very good at it when I did last. So it surprised me just how much fun I've been having dusting off my knowledge and diving in. There's so little abstraction that at times you're literally telling the compiler where to put these specific bits in memory, and in order to do so you have to know the endianness of the machine (unless you're just a lazy SOB and assume little endian, of which there are many in the open source world).

For example, when it comes time to store a 32 bit integer as a series of bytes, one *might* do the following.

* create an array of 4 eight-bit integers (more depth on the subject to follow)
* mask off the bits not needed for each of the four bytes you want to extract
* shift the remaining bits into position
* use the resulting 8 bit int as the value of the appropriate position in the array, which depends on the endianness of the machine
* write the array into the stream
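The steps above can be sketched roughly like this; the function name and the choice of little endian output are my own illustration, not code from Haggis:

```
#include <stdint.h>
#include <stdio.h>

/* Serialize a 32 bit integer as 4 bytes, little endian, by masking and
 * shifting each byte into place. Illustrative only. */
size_t write_u32_le(FILE *stream, uint32_t val) {
    uint8_t bytes[4];
    bytes[0] = (uint8_t)(val & 0xff);         /* lowest byte first */
    bytes[1] = (uint8_t)((val >> 8) & 0xff);
    bytes[2] = (uint8_t)((val >> 16) & 0xff);
    bytes[3] = (uint8_t)((val >> 24) & 0xff);
    return fwrite(bytes, 1, 4, stream);
}
```

Every one of those masks and shifts is hand-written, which is exactly where human error creeps in.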

Now notice I said one *might* do it this way. But let's see if I can make a Rust programmer scream in horror with the following snippet.

```
#include <stdint.h> // exact sized integer types
#include <stdio.h>

typedef uint8_t u8;

union u16 {
    uint16_t val;
    u8 bytes[2];
};

int load_u16(FILE *stream, union u16 *num) {
    return fread(num->bytes, 1, 2, stream);
}

int store_u16(FILE *stream, union u16 num) {
    return fwrite(num.bytes, 1, 2, stream);
}
```

I mean, that's perfectly valid C, and the compiler definitely doesn't have a problem with it. It won't crash at runtime, either, because all we're doing is providing a way to examine the value of an unsigned 16-bit integer while also providing a way to examine the underlying bytes. It's a perfectly valid access. It just feels sort of wrong, coming from Rust, to have this sort of capability. It's essentially being able to cast from a u16 to a two-byte array of u8 and back again. Granted, if that was the entire implementation this would blow up in your face on a big endian machine, because the bytes would be swapped, but that's why we have a preprocessor.

```
#include <endian.h>
#include <stdint.h> // exact sized integer types
#include <stdio.h>

typedef uint8_t u8;

union u16 {
    uint16_t val;
    u8 bytes[2];
};

#if __BYTE_ORDER == __LITTLE_ENDIAN
// little endian functions
#else
// big endian functions
#endif
```

The big endian versions would just make sure to swap the byte order after the read or before the write operations. But at any rate, I find it freaking hilarious that the compiler will accept these sorts of shenanigans after spending the past few years in Rust. Feels like I'm getting away with a major crime. My code is flashing gang signs at your borrow checker. How you like them bytes?
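On a big endian host that swap might look something like the following sketch; the `_swapped` name is mine, and it assumes the on-disk format stays little endian:

```
#include <stdint.h>
#include <stdio.h>

typedef uint8_t u8;

union u16 {
    uint16_t val;
    u8 bytes[2];
};

/* Big endian variant: swap the two bytes before writing so the stream
 * stays little endian. A sketch, not the actual Haggis code. */
int store_u16_swapped(FILE *stream, union u16 num) {
    u8 tmp = num.bytes[0];
    num.bytes[0] = num.bytes[1];
    num.bytes[1] = tmp;
    return fwrite(num.bytes, 1, 2, stream);
}
```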

The astute will notice that I've pulled in stdint.h and did a typedef so I can call a u8 a u8. I might be laughing at the shenanigans that C allows, but I still think it's freaking stupid to have int, long, long long, double, long double, short, etc. and leave it completely up to the implementation what any of those numeric types actually mean. Honestly, I think any C programmer who isn't using exact width integers in 2023 is just being a bit of a dick at this point.

One of the things that all three of Rust, Zig and Hare provide is tagged unions (although in Rust they're just called enum types with associated data). This is something I wish that C had at the language level, but in practice it's possible to get most of the benefit by rolling your own. Consider Haggis' optional checksumming.

```
enum haggis_algorithm {
    md5,
    sha1,
    sha256,
    skip,
};

union haggis_sum {
    u8 md5[16];
    u8 sha1[20];
    u8 sha256[32];
};

struct haggis_checksum {
    enum haggis_algorithm tag;
    union haggis_sum *sum;
};
```

The difference between this "roll your own" approach and that provided at the language level by the other three languages is that you have to remember to set the tag when initializing a `haggis_checksum` struct and to read the tag before accessing the data. The language level constructs in the other languages will enforce this so you can't screw it up. But it does provide a primitive sort of polymorphism, allowing you to do some interesting things with data structures. I wouldn't have known to even try it a few years ago.
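In practice that discipline is a switch on the tag before ever touching the union. Here `sum_length` is an illustrative helper of mine, not part of Haggis:

```
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

enum haggis_algorithm {
    md5,
    sha1,
    sha256,
    skip,
};

union haggis_sum {
    u8 md5[16];
    u8 sha1[20];
    u8 sha256[32];
};

struct haggis_checksum {
    enum haggis_algorithm tag;
    union haggis_sum *sum;
};

/* Read the tag first, then touch only the matching union member. */
size_t sum_length(const struct haggis_checksum *cs) {
    switch (cs->tag) {
    case md5:    return sizeof cs->sum->md5;    /* 16 */
    case sha1:   return sizeof cs->sum->sha1;   /* 20 */
    case sha256: return sizeof cs->sum->sha256; /* 32 */
    case skip:   return 0; /* no checksum stored */
    }
    return 0;
}
```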