I’ve spent a few days cleaning up this site – I just deleted another third of the pages (ones I deemed crufty or irrelevant) – and also just published a rant about the Freescale FRDM boards and their terrible firmware, and how to fix it.
I just rediscovered the work(s) of Tom Murphy VII – aka tom7. He studied functional languages and type systems at CMU – and got his PhD in 2008 – but instead of going to work for Jane Street or Standard Chartered (modeling derivatives or whatever it is that they do in those places) he creates goofy, irreverent, and brilliant “works”. (He probably has a day job, but he’s not talking about it.)
I wanted to share two of these. One is an insane/hilarious bicycle “hack”:
The other is an insane/brilliant compiler hack. He has written a C compiler that generates an executable (in DOS EXE format) that is simultaneously a paper describing the compiler. In order to do this he had to limit himself to the subset of the x86 instruction set that is represented by printable ASCII characters – the range from 20 (hex) to 7E (hex). Since all the bytes in the file must be printable ASCII – header, data, and code – he was unable to generate ELF (BSD and Linux), Mach-O (macOS), or PE (Windows) executables, since each of these requires that certain fields in the header contain non-printable characters.
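To make the constraint concrete, here is a minimal Python sketch (not tom7's tooling) that reports which bytes of a blob fall outside the 20 to 7E hex range. The function name is made up for illustration. The two inputs are well-known magic numbers: ELF files begin with the byte 7F followed by "ELF", while DOS executables begin with "MZ".

```python
PRINTABLE_LOW = 0x20   # space
PRINTABLE_HIGH = 0x7E  # tilde


def non_printable_offsets(data: bytes) -> list[int]:
    """Return the offsets of all bytes outside the range 0x20..0x7E."""
    return [
        offset
        for offset, byte in enumerate(data)
        if not PRINTABLE_LOW <= byte <= PRINTABLE_HIGH
    ]


# The ELF magic number starts with 0x7F, one past the printable range.
print(non_printable_offsets(b"\x7fELF"))  # [0]

# The DOS EXE magic number is just the two letters "MZ".
print(non_printable_offsets(b"MZ"))       # []
```

Note that a newline is also outside the range, which is why the plain text version of the paper cannot contain any.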
Read the paper in PDF format or plain text. The plain text version – which is simultaneously a DOS executable! – is in an unusual format: it consists of “lines” of 160 characters, padded by spaces. There are no newline or carriage return characters! If you want to read the text version on a terminal, it’s best to do something like this:
dd if=paper.txt cbs=160 conv=unblock | less -S
The -S option keeps less from wrapping the lines. You can scroll horizontally with the arrow keys. You can also “zoom out” so your terminal is big enough to show lines 160 characters long. I tried this on my Chromebook by repeatedly pressing Ctrl-Minus. Here is the first page on a terminal 168 characters wide.
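If dd is not handy, the same "unblocking" can be sketched in a few lines of Python. This is only an approximation of what conv=unblock does (cut the input into fixed-size records and drop the trailing padding spaces); the function name and the sample data are invented for the example.

```python
def unblock(data: bytes, width: int = 160) -> list[str]:
    """Split fixed-width records into lines, dropping trailing pad spaces."""
    records = (data[i:i + width] for i in range(0, len(data), width))
    return [record.decode("ascii").rstrip(" ") for record in records]


# Two fake 160-character "lines", padded with spaces and with no newlines.
sample = b"Hello".ljust(160) + b"World".ljust(160)

for line in unblock(sample):
    print(line)
```

To use it on the real file you would read the bytes with open("paper.txt", "rb").read() and pass them in.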
The paper was submitted to SIGBOVIK – a tongue-in-cheek but very technical conference that CMU holds on (or near) April Fool’s day.
Forcing users to log in, combined with a change to the UI that Green characterizes as a “dark pattern”, greatly increases the danger of accidentally syncing the local browser history, bookmarks, etc to Google – something that some people, for privacy reasons, very much want to avoid.
Green is one such person. His blog post describes why he liked Chrome, how he was careful to never sign in to the browser, what the security and privacy implications of this change are, and why it’s such a big fuckup on Google’s part.
For a long time I too never signed in to Chrome. I finally “caved”, but I was careful to set up a sync passphrase before turning sync on – the idea being that if in fact my data is encrypted on my machine before it hits Google’s servers, there is no way for them to read it (unless of course they include their own “back door” key when they do the encryption). If I sign in to the browser on a new machine, nothing gets synced until I type in my passphrase. This suggests that Google can’t read my data.
But I can’t be sure. I’m still trusting Google, which makes me somewhat unhappy.
Green’s post got me thinking about what would be involved in de-Googling my life. I have an (old-ish) Android phone, a Chromebook (!!), and a free, grandfathered “G Suite” account for nimblemachines.com – registered back when it had a more suitable name (“Google Docs for domains” or something like that). I’m not giving up my Chromebook, but I would be happy to give up my phone – though I’m not sure what I would replace it with. Is Apple better than Google about privacy?!?
You might think I’m stupid for not wanting to give up my Chromebook – especially if I claim to care at all about my privacy. But it’s totally possible to use a Chromebook with a crap, empty Google account and not use any Google services with the machine. With careful management of cookies – a total PITA – it’s possible to be reasonably private and secure. De-Googling and Chromebooks are not entirely incompatible.
Replacing the email service that I get from Google for nimblemachines.com would be painful, however. Years ago I helped run a very small hosting service. I used djbdns and qmail and had a lot of fun learning how they work. But hosting email is a pain. The smtpd part isn’t that bad (qmail and postfix and now OpenSMTPD are great options), but the anti-spam and POP/IMAP/Webmail bits are truly awful.
But do I want to find a paid email provider to replace Google’s free service? I would have to trust them. At least I know the nature of my pact with the devil now, and Google’s mail service is pretty awesome in many ways.
Should I switch to ProtonMail?
Should I bite the bullet, set up a cloud server, and try running my own infrastructure again?
I’m still not sure about any of this, but in the process of “thinking out loud” and doing some Google searches related to email infrastructure I discovered SPF, DKIM, and DMARC. These are technologies to reduce spam and phishing and to improve authentication of email.
I decided to write up what I learned about making email slightly more secure.
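As a small taste of what these look like in practice: a DMARC policy is published as a DNS TXT record at _dmarc.<domain>, written as a semicolon-separated list of tag=value pairs. The sketch below parses such a record in Python. The record and the mailbox in it are made-up examples, not taken from any real domain, and the parser does no validation of tag names or values.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Parse a DMARC TXT record into a dict of tag -> value."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        key, _, value = part.partition("=")
        tags[key.strip()] = value.strip()
    return tags


# Hypothetical record: quarantine failing mail and send aggregate reports.
example = "v=DMARC1; p=quarantine; rua=mailto:postmaster@example.com"

policy = parse_dmarc(example)
print(policy["p"])    # quarantine
print(policy["rua"])  # mailto:postmaster@example.com
```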
I can’t help but throw another log onto the “ICANN fire” here.
I recently ran across two GitHub gists – here and here – that both mention an ICANN URL that lists the entire set of applications for new TLD strings: http://newgtlds.icann.org/en/program-status/application-results/strings-1200utc-13jun12-en.
If you follow that link, you’ll be redirected (from the original list) to a complicated “query” page that makes it impossible to easily get the whole list. You’d have to download and stitch together 56 HTML pages, where the original list was available as an HTML page, a PDF file, or a CSV file.
Why did ICANN make this data inaccessible? Why did they remove the original URL?
By putting that URL into the Internet Archive’s Wayback Machine I was able to get both the HTML and CSV versions of the application list out of the dumpster. The CSV is the real gold mine. I had to open it in another tab, view source, save that, and then transliterate (using tr) carriage returns (^M) to newlines (^J) in order to view the file. (Mysteriously, lines were separated by single carriage return characters – something that isn’t normal on any current computing platform.)
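The transliteration step is a one-liner in most languages. Here is a Python sketch of the same idea, working on bytes so that no character-set guessing is involved. The function name and the sample data are invented for illustration; the real file is not reproduced here.

```python
def cr_to_lf(data: bytes) -> bytes:
    """Replace every carriage return (^M) with a newline (^J)."""
    return data.replace(b"\r", b"\n")


# Fake CSV content whose "lines" are separated by bare carriage returns.
sample = b"a,b\rc,d\r"

print(cr_to_lf(sample).decode("ascii").splitlines())  # ['a,b', 'c,d']
```

This is only safe because the file uses bare carriage returns; a file with CR+LF line endings would end up with blank lines in between.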
Here is (my transliterated version of) the original CSV file and here is my spreadsheet version of the list. It’s interesting to sort on the “Primary contact” column. One thing that becomes immediately clear is the game that Donuts, Inc were playing. Their strangely-named LLCs are all there, and all the contacts go to @donuts.co addresses. They also all have the same primary contact: Daniel Schindler. (I’m trying hard not to make some kind of joke about lists here.)
Another thing that becomes clear is that Donuts were not the only company playing the “hide behind lots of little companies” game. radixregistry.com in the UAE, famousfourmedia.com in Gibraltar, whatbox.co in the US, and secondgen.com in the US all applied using LLCs or other shell companies as the registrants.
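The same pattern can be pulled out without a spreadsheet by counting applications per contact email domain. The Python sketch below shows the idea on a tiny invented sample. The column names ("String", "Applicant", "Primary contact email") and the applicant names are assumptions made for the example and may not match the headers in the real ICANN CSV.

```python
import csv
import io
from collections import Counter


def count_by_contact_domain(csv_text, email_column="Primary contact email"):
    """Count rows per email domain found in the given column."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        email = row[email_column].strip().lower()
        domain = email.rpartition("@")[2]  # text after the last "@"
        counts[domain] += 1
    return counts


# Invented sample data, standing in for the real application list.
sample = """String,Applicant,Primary contact email
example1,Hypothetical One LLC,a@donuts.co
example2,Hypothetical Two LLC,b@donuts.co
example3,Someone Else Ltd,c@example.com
"""

for domain, n in count_by_contact_domain(sample).most_common():
    print(domain, n)
```

Domains that show up many times under many different applicant names are the ones worth a closer look.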
Lastly, it’s interesting to note how many of the companies applying for new TLDs are domiciled in offshore money laundering centers: Cayman Islands (KY), British Virgin Islands (VG), and Gibraltar (GI), for example.
People (and companies) whose sole business is buying and selling domain names are parasites.
On top of this, they also like to shill for friends and sow FUD (fear, uncertainty, and doubt) about perfectly good organizations, businesses, and services.
My concrete example of all of these is DN Journal. They are all about the “domain name business” – aka cyber-squatting. They encourage, document, and illustrate the practice of buying up domain names, holding them until a buyer comes along who actually wants to use a domain, and then selling it to them at scalper’s prices.
I would love to find a technological solution that bans this behavior, but currently no one knows how to do it.
The best we can do is to call out these parasites and make pariahs of them.
Regarding cost, we also talked about a new development – free SSL certificates. These are being offered through Let’s Encrypt, a Google-backed Certificate Authority. As always, this is a case of “you get what you pay for.” Bill Grueninger noted that Lets Encrypt offers basic encryption only – none of the higher trust levels like verified identity. As a free service it is free to anyone – bad actors as well as good ones. Being encrypted doesn’t necessarily mean being safe if you don’t know who is behind the site. Grueninger believes this could make an already bad phishing problem worse unless a way to keep the keep the miscreants out is quickly implemented.
Actually, there is more: the .htm suffix in their URLs shows that they use IIS. They use HTML tables for layout. And they think that Let’s Encrypt’s domain is letsencrypt.org. Oh my.
Let’s Encrypt actually has a very thoughtful FAQ entry about why CA’s should not be in the business of policing malware and phishing.
I’ve finally written up my thoughts about Donuts, Inc and why I think they are a creepy company.
This is a small piece of a much larger story: how ICANN has fumbled and mismanaged the introduction of new top-level domains (TLDs) into the DNS.
Donuts are a big player in this new space. Because they own United TLD Holdco, they now “control” 241 of the 1578 currently-delegated TLDs. (Donuts have 201; United TLD Holdco have the other 40. I’ve got a spreadsheet to prove it. ;-))
Also today I got a Dev channel update on my Acer 14 Chromebook, and it included Linux vm/containers support!
I installed the update, followed the directions to install Android Studio and fired it up. It worked after a reboot, but because USB devices don’t yet get “forwarded” to the container it’s kinda useless to me. Also, the default Debian container with Android Studio added consumes 5.5 GB of disk space.
I’m going to stick with my kludgey Gentoo chroots for now.
It looks like certain Chromebooks are going to be gaining the ability to run Linux containers (in a VM) without going into developer mode.
This means that without compromising the security of your device you will be able to access the Linux command line, and install and run both command line and graphical applications and games – including Android Studio!
This ability is initially rolling out to Pixelbooks (on the dev channel), but will be coming eventually to many (but not all) Chromebooks. 32-bit ARM processors are not going to be supported, nor are Bay Trail machines. But many older machines are, as are the newest Broadwell, Braswell, Skylake, etc. I can’t tell from the announcement whether Haswell machines are going to be supported or not. The official device list shows these as having a 3.8 kernel, and anything 3.10 or earlier will not be supported. However, these are not on the old kernels device list.
Here is the bloggy announcement from Google. (Note the interesting domain!)
You can read all the details – including which varieties of Chromebook will be supported – in the Chromium OS docs tree. (There is other interesting stuff in there, including this.)
Once this is widely rolled out, are Chromebooks going to start to displace Macs as the kickass developer machine of choice? It will be interesting to watch.
A note for fellow Acer Chromebook 14 (edgar) owners: you can try this now in the Canary channel. (To switch to Canary you have to be in developer mode, and on either the Beta or Dev channel. Open a crosh window and type “live_in_a_coal_mine”. Then head over to “Settings > About Chrome OS” and search for updates. After it downloads the update you will be prompted to reboot.)
I took it for a spin and it started out working fine, but then something got wedged and I was unable to use many of the built-in apps, including the new “Linux apps” app. Tapping on them (to start them) would cause the cursor to freeze for a few seconds, and then a new window would open up with the message that Chrome had quit unexpectedly, and would I like to restore my tabs.
So it’s certainly raw at the moment. I’ve switched back to the Dev channel from Canary and may wait for this feature to arrive on Dev before I try it out again.
This past May, GitHub quietly announced that custom domains hosted on their GitHub Pages service would be available over HTTPS. I missed this announcement, and only found out about it by accident a few days ago when a friend sent a screenshot of this site being displayed with a green lock and an “https” in the URL!
I was rather surprised – and pleased!
GitHub has teamed up with the awesome folks at Let’s Encrypt to make this possible.
I’ve wanted to SSLify this site for some time. I futzed around with Amazon S3 but got bogged down with IAM and getting log buckets to work right, so I never even made it to the “SSL” part of the story. Recently I discovered Netlify but again, never signed up or tried it out.
Now I don’t have to do anything, until and unless I someday want access to my server logs...
To make sure that as many visitors as possible access this site over HTTPS, I followed GitHub’s directions for changing my DNS A records, and for setting “Enforce HTTPS” on the repository. I have some internal links that still use HTTP, but the “Enforce HTTPS” setting redirects them, and this seems to work well. We’ll see if it screws up Google Search Console (that is, more than it is already screwed up by Google).
Last night I tried an experiment: to see how easy it is to set up an HTTPS-enabled site from scratch on GitHub Pages. The answer: super easy! And it’s surprisingly fast. Within a minute or two of pushing my tiny site it had a valid SSL cert from Let’s Encrypt!
I decided to document the process of getting started with GitHub Pages.
Today’s rant is less about technology per se and more about the tech press.
I’ve been researching monochrome laser printers, and finding that the reviews published by CNet, PCWorld, and the like are singularly unhelpful – mostly because they have weird ideas about cost. They seem to think that a low purchase price is more important than low running costs. I disagree.
This printer is going to be for my own personal use, and I don’t print a lot, so running cost isn’t a huge deal for me, but for many people, and especially for business use, it should probably be part of the selection process. I’ve had my share of annoyances with inkjet printers and dried-up ink; that’s the main reason a laser seems like a better choice (again, for me). I also like the print quality of lasers, and don’t especially need color. I want to print out datasheets, papers, manuals, and things like that. And experiment with TeX and METAFONT, and PostScript and PDF technologies.
In general when buying “tools” I like to spend a bit more to get the real deal. Recently I was in the market for a DMM (digital multimeter), and after reading and watching tons of reviews I decided on a Keysight U1233A. This is a true RMS “electrician’s” meter, but I think it will serve me well for doing simple electronics troubleshooting too. And if I ever play around with power electronics (PWM for motors or power inverters) it will be perfect.
It cost me $170 (from Mouser), which the popular press would probably consider expensive. But in its favor are the following:
- Keysight is an established test and measurement company – in fact, it’s the original T&M company. It’s the newest incarnation of Hewlett-Packard, and – like the original HP – it is solely focussed on T&M.
- Safety is an important issue with DMMs. The cheaper ones tend to be cheaply built, and if they blow up in your hand/face they can cause serious injury. Keysight’s meters are intended for the professional market and are well made.
- The battery life is amazing. It uses four AAA batteries (rechargeables will work) and will run for 500 hours!
- There are interesting add-ons available. I spent $30 more to get the IR-to-USB connector so I can use it to do data collection – if I ever find an interesting application for such. ;-)
I’m trying to apply the same criteria to buying a laser printer. Which means I’m looking at “small workgroup” printers rather than at bottom-of-the-barrel consumer printers. My desiderata include:
- networked access (preferably not wireless, since everyone’s wireless firmware is probably buggier than their wired firmware)
- reasonable running costs (this mostly means limiting my search to printers that take high-capacity toner cartridges)
- paper duplexing (I’ve spent an unreasonable amount of time futzing around to get cheap inkjets to print on both sides, and something invariably goes wrong – usually the printer grabs too many pages at some point and screws up the back-to-front matching)
- PostScript support, ideally (mostly because I’m curious, but also because of painful memories – again with inkjets – of trying to coax Ghostscript to do the right thing)
I’ve looked at this category of printers several times over the last few years, and previously two printers stood out for me: the Brother HL-L5100DN, and the Oki B412DN. Both have an MSRP of $199, and neither has PostScript support. The Brother supports “BrotherScript”; the Oki’s more expensive sibling, the B432DN (MSRP $349), supports true PostScript. Both are networkable and support duplexing.
Recently I added two printers to my comparison: the Canon imageCLASS LBP214dw and the Lexmark MS321dn. Canon sell the LBP214dw for $229; the Lexmark seems to have an MSRP of around $199. Unlike the other three printers, the Canon has both wired and wireless connections; and these two both support PostScript 3.
I haven’t done a detailed comparison, but I think the running costs are comparable – if you buy the printer’s “high capacity” toner cartridge.
But here is where the tech press annoyed me. They seem to consider $200 “rather expensive”. For a fast, duplexing, networked laser printer with low running costs? I don’t agree – at all. And I wish they would be less misleading – penalizing these printers on the basis of cost, when in reality they will be, especially in a business setting, both cheaper and much less annoying to operate than their “less expensive” brethren, both inkjet and laser.
Anyway, that’s most of my rant, but I wanted to add another “axis” to the comparison – that of operating power. While in other respects the Canon might top my list (more built-in PostScript fonts than the Lexmark; wireless option if I decide I need it), it seems to draw a lot of power while printing (max 1320 W). The others are between 500 and 900 W.
Here is a handy table comparing the four (they all do duplexing and wired networking, so I’m leaving those out of the comparison):
                      Memory   PPM   Max power   Avg power   PS3   PS fonts
===========================================================================
Brother HL-L5100DN    256 MB   42    n/a         620 W       no    n/a
Oki B412dn            512 MB   35    900 W       560 W       no    n/a
Canon LBP214dw        1 GB     40    1320 W      n/a         yes   136
Lexmark MS321dn       512 MB   38    n/a         520 W       yes   91
PPM is print speed in pages per minute, letter, simplex. Not all manufacturers specify the duplex speed, so it’s hard to compare.
Paper handling for all four is comparable.
The Brother’s smaller memory could be a concern. This can slow it down when doing graphics-heavy printing, or printing PostScript using the BrotherScript emulation. The other printers are going to be better choices for this kind of workload. I also remember reading online consumer reviews saying that the Brother can have trouble waking from sleep, and also can have trouble with “specks” of toner on the page, especially the first page. These sound like annoyances that could really be a problem in the long term...
It’s a bit difficult to compare the cost-per-page, since toner cartridge prices vary a lot, depending on the supplier, but I think all of these come in around 2 cents per page when using a high-capacity toner cartridge. I should try to do a more careful comparison!
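The arithmetic behind a cost-per-page figure is just the cartridge price divided by its rated page yield. A minimal Python sketch follows; the price and yield are hypothetical round numbers picked to land on 2 cents and are not quotes for any of the four printers.

```python
def cost_per_page_cents(cartridge_price_dollars: float, page_yield: int) -> float:
    """Toner cost per page, in cents, ignoring paper and drum wear."""
    return 100.0 * cartridge_price_dollars / page_yield


# Hypothetical high-capacity cartridge: $160 for a rated 8000 pages.
print(cost_per_page_cents(160.0, 8000))  # 2.0
```

A more careful comparison would also fold in the drum or imaging unit, which some of these printers sell separately from the toner.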
I hope this helps someone else who is shopping for a printer!
I upgraded my cable service recently, from 75 Mbps to 150 Mbps, but didn’t notice any difference when running speedtest.net on my wirelessly-connected machines.
Turns out that my wireless router – a Linksys E1200, running the “tomato by shibby” firmware (which is awesome, by the way) – has fast Ethernet ports (100 Mbps), even though it is capable of wireless N speeds (300 Mbps). Does that make sense to you?
I didn’t think so.
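The reasoning is that a path is only as fast as its slowest link. A tiny Python sketch of that ceiling, using the nominal rates mentioned above (cable service, the router's fast Ethernet port, wireless N), looks like this. Real wireless throughput is well below the nominal rate, so treat the result as an upper bound rather than a prediction.

```python
def ceiling_mbps(*link_speeds_mbps: float) -> float:
    """Upper bound on throughput: the slowest link on the path."""
    return min(link_speeds_mbps)


# Before the upgrade: 75 Mbps service, 100 Mbps port, 300 Mbps wireless N.
print(ceiling_mbps(75, 100, 300))   # 75

# After the upgrade: the 100 Mbps Ethernet port is now the bottleneck.
print(ceiling_mbps(150, 100, 300))  # 100
```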
I decided – foolishly, it turns out – to replace it with something “better”, and found another Linksys – the EA6700 – on sale at Best Buy for $46. These are N/AC, capable of crazy speeds, have gigabit Ethernet ports, and can run tomato and dd-wrt firmware ... supposedly.
I got the new routers today (I bought two) and decided to try them out. The stock firmware is truly awful, as I expected. I was able to get 135 Mbps over wireless, which is awesome, but the IPv6 support was weirdly broken in a similar way to the stock E1200 firmware. (“tomato by shibby” totally fixes this – one reason why it’s awesome.)
Given the awfulness of the stock firmware, and armed with the knowledge that I could run tomato, I duly downloaded the newest tomato by shibby build (from May 2017). The router refused it. After Googling, I found that you have to downgrade to an earlier stock firmware which predates a change that requires that firmware updates be signed. (The release notes claim that this change was required by the FCC.)
The “official” Linksys download page for the router only has the latest firmware, but another Google search turned up a link – on linksys.com – for a previous version (22.214.171.124989).
But the router refused that too.
There is some kind of Windows tftp client that Linksys make available, but it does something different than the BSD tftp on my Mac, which did nothing.
At this point I had wasted several hours, and in the process learned that there is some kind of NVRAM bug in the replacement firmware – it sounds like it affects both tomato and dd-wrt – that causes it to randomly revert to stock.
I decided that even if, by herculean measures, I managed to get aftermarket firmware on this damn thing, it wasn’t going to be worth the trouble. In contrast, the E1200 is trivial to reflash, and it works like a champ.
The process got me asking – for the zillionth time – why is consumer WiFi hardware still such utter crap? It’s crap that’s getting crappier, if that’s possible.
It seems like there is room for someone to create a fast box with open-source software, simple administration – scriptable! with a command-line interface! – with a decent firmware environment (like CFE or U-Boot but simpler), and a three-pin serial jack (RX, TX, GND) for a console in case things go wrong (so one can boot into the firmware environment and reflash).
I would just need to find a small, cheap box to run pfSense on. The Netgate SG-1000 is cute, but I think it needs more routing possibilities – for example, at least two LAN-side ports.
What about a Soekris box? Hmm. Soekris Engineering is gone. Someone in that Hacker News conversation suggested finding a used and cheap thin client, adding a PCI network interface (for the second network), and using that. Interesting idea.
Or one of these?
Just found a couple of interesting Terence Parr talks. The first introduces ANTLRv4 (the latest and final version, he promises!) and its cool features. The beginning is good; the middle – during which he futzes with an ANTLR IDE – is a bit slow and maybe should be skipped; and the end – the introduction to the nifty kind of parser (adaptive LL(*)) that ANTLRv4 generates – is good.
The second video is a two-hour-long tutorial on building a virtual machine (to implement a language) from scratch! I only watched the beginning, but it’s probably well worth watching.
Curiously, Rob Pike has some interesting things to say about lexing and parsing as well – and whether to use tools like lex, yacc, and ANTLR, or to write our own from scratch... His talk is about lexing templates in Go.
While I’m in a microblogging mood I thought I would add something about CWEB and literate programming that I didn’t say before.
I’ve tried reading (parts of) the source to TeX and METAFONT (by running weave and TeX on the .web source files and looking at the resulting DVI or PostScript file) and found that I was annoyed and distracted by the “fancy” typography. In order to understand the code, I had to translate – in my head – the fancy symbols back into the sequences that I might type at the keyboard.
Most of my programs are edited at least as often as they are read, and it is distracting to have to switch between plain ASCII for editing and fancy fonts and symbols for reading. It is much better for the literate-programming tool to display the code almost exactly as written. (I do believe in typographical distinction for chunk names.)
The problem, fundamentally, is that I’m mostly going to be sitting in the editor – not reading the Postscript/PDF – and I want what I see in that context to be readable. (Hence, I think, the current popularity of “syntax coloring”. It’s a lightweight way to add a tiny bit of semantic annotation to the text without being too intrusive, and without requiring graphics or a GUI.)
I think the idea of switching back and forth – between the “authoring” format and the rendered output – is unnecessary cognitive overhead.
Another thing: WEB and CWEB were developed, in part, so that the exposition of a program to a human reader could be done in a comfortable, sensible pedagogical sequence, rather than in whatever sequence the compiler required.
weave (for WEB and Pascal) and cweave (for CWEB and C) reorder the code to suit the reader. tangle (for WEB and Pascal) and ctangle (for CWEB and C) reorder the code to suit the compiler. This gives the author the freedom to choose any order of exposition.
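To show what "reordering for the compiler" means, here is a toy tangle in Python. It is a sketch of the general idea only, not how tangle or ctangle are implemented: chunks are stored by name, and a line consisting of a chunk reference is replaced by that chunk's expansion. The <<name>> reference syntax is borrowed from noweb rather than CWEB's own notation, the chunk names are invented, and there is no protection against chunks that refer to themselves.

```python
import re

# A line that is nothing but a chunk reference, optionally indented.
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def tangle(chunks: dict[str, list[str]], root: str) -> list[str]:
    """Expand chunk references depth-first, starting at `root`."""
    out = []
    for line in chunks[root]:
        match = CHUNK_REF.match(line)
        if match:
            indent, name = match.groups()
            # Carry the reference's indentation into the expanded lines.
            out.extend(indent + expanded for expanded in tangle(chunks, name))
        else:
            out.append(line)
    return out


# The author presents the outline first and the details afterwards ...
chunks = {
    "main program": [
        "<<includes>>",
        "int main(void) {",
        "    <<say hello>>",
        "    return 0;",
        "}",
    ],
    "say hello": ['puts("hello");'],
    "includes": ["#include <stdio.h>"],
}

# ... and tangle emits them in the order the C compiler needs.
print("\n".join(tangle(chunks, "main program")))
```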
I’ve mostly been writing Forth code for the last few years, and Forth interpreters and compilers tend to be one-pass, so it’s difficult to do forward references. Hence, one tends to build things bottom-up. Because of the granular nature of Forth (lots of small words that do one thing) it’s also easy to test bottom-up. This has the huge advantage that at each layer you’re building on a lower layer that has been already tested, and each layer of your language has richer semantics. It’s like a layer cake of domain-specific languages (DSLs). This is one of the “features” of Forth that makes it so powerful. (Of course, one could write C code the same way, but Forth has the advantage that the syntax (what little there is) is extensible, and there is no distinction between system-level and application-level code, nor is there a distinction between “built-in” and user-written words. Everything is made of the same stuff.)
Forth is really a meta-language, rather than a language. One tends to start with the built-in words and then build an application-specific (domain-specific) vocabulary in which the rest of the system is written. But again, what’s strange about this is that it’s a continuum. Every word you write gets you closer to the perfect set of words for your application (if you’re doing it right).
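Here is a toy illustration of that one-pass, bottom-up discipline, written in Python rather than Forth. It is a sketch under heavy simplifying assumptions (integers only, three built-in words, no control flow) and does not reflect how muforth or any real Forth is implemented. The point is that a colon definition can only use words that already exist, so each new word is built on the layer beneath it.

```python
class ToyForth:
    """A one-pass, Forth-flavored toy: words must exist before use."""

    def __init__(self):
        self.stack = []
        # Built-in words are plain Python callables acting on the stack.
        self.words = {
            "+": lambda: self.stack.append(self.stack.pop() + self.stack.pop()),
            "*": lambda: self.stack.append(self.stack.pop() * self.stack.pop()),
            "dup": lambda: self.stack.append(self.stack[-1]),
        }

    def compile_token(self, token):
        """Turn a token into a callable, or fail if it is unknown."""
        if token in self.words:
            return self.words[token]
        try:
            number = int(token)
        except ValueError:
            raise NameError(f"undefined word: {token}") from None
        return lambda: self.stack.append(number)

    def run(self, source):
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":
                # Colon definition: compile tokens up to the closing ";".
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(self.compile_token(t))
                self.words[name] = self.make_word(body)
            else:
                self.compile_token(token)()
        return self.stack

    @staticmethod
    def make_word(body):
        def word():
            for step in body:
                step()
        return word


# Bottom-up: "cube" is built on "square", which is built on the built-ins.
forth = ToyForth()
print(forth.run(": square dup * ; : cube dup square * ; 3 cube"))  # [27]

# A forward reference fails, because "double" is not defined yet.
try:
    ToyForth().run(": quad double double ; : double dup + ;")
except NameError as err:
    print(err)  # undefined word: double
```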
So why this long aside?
Bottom-up is also, I think, a great order for exposition/exegesis/explanation to a human reader. You bring them along with you as you build your DSL – or layers of DSLs.
And so it has always seemed to me that Forth doesn’t really need special “literate programming” tools. If written well, and commented well, Forth code can naturally be literate code.
Last night I watched two interesting talks by George Neville-Neil about using FreeBSD in teaching and research. The first is about using FreeBSD and Dtrace to “peer inside” a running operating system, in order to learn its workings; the second is about using FreeBSD as a substrate for research.
How did it happen?
I’m a fan of RISC-V, so I watched this video – hopefully the first in a series! – about building a RISC-V processor using modern “TTL” logic chips:
Robert does a great job of explaining the RISC-V instruction set and his design choices for the registers. Too bad he’s concentrating on the least interesting part of RISC-V. Once he starts talking about the instruction decoder and the ALU it should get interesting.
I enjoyed the video, so I decided to see what else he’s been up to. I found a video about building a CPU in an FPGA. Sounds interesting, right?
Part way through I decided that the Z-machine was too complicated. Writing an adventure in Forth would be much more interesting. Hmm – where is the source for the Crowther/Woods Adventure game anyway?
I had forgotten that there was another version of Adventure, written by Jim Gillogly, that used to be a part of every BSD system. I’m not sure about the other BSDs, but FreeBSD got rid of most of the games a number of years ago. DragonFlyBSD still has the Gillogly version of Adventure in its base system.
Thinking it would be fun to try Knuth’s version, I went to find a copy of CWEB. In order to compile a program written using CWEB you need ctangle, a tool that extracts the C code in a compiler-friendly form.
You have to be careful untarring the CWEB source. Unlike most source packages, cweb.tar.gz does not create a subdirectory when untarred. You have to do that yourself. Compiling with a recent version of GCC (I’m using 6.4.0) generates a lot of warnings. (There is a patched version of CWEB on GitHub.)
I didn’t bother to install it. After gunzipping Knuth’s advent.w.gz I pointed ctangle at it, got a .c file, and compiled that. (More GCC warnings.)
I think, if I were going down this path again, I would instead try to build the Gillogly BSD version.
- it’s a simple RISC architecture, not tied to a vendor;
- the MMIX simulator might yield some interesting measurements of caching and branch-prediction behavior that might shed some light on the performance of different threading models.
We’ll see if this happens. ;-)
I’ve just added support for asciinema and asciicasts to my web site generator! Here is a simple example, recorded on my Acer Chromebook 14.
And here I am doing a pointless “demo” of my Lua-based static site generator (the engine behind this site and the muforth site):
I got an email this morning from Google, announcing that the new Search Console (BETA!) will solve all of my problems.
(Search Console is what Google now calls what used to be called Google Webmaster Tools.)
Yes, I was “excited” by the idea that I could finally learn why Google has been refusing to index a good chunk of this site.
I feel like I’ve been doing everything right. Link rel=canonical metadata? Check. Trailing slash on all URLs? Check. Sitemap.xml? Check. But nothing seemed to help. Every time I checked, a quarter to a third of the pages on the site were not indexed.
So I checked out the new Search Console. Sure enough, 45 of my pages are not in the index. But no explanation why. However, I can request, one-by-one, to have the pages indexed! But but but... I’ve already submitted a carefully-crafted sitemap file that describes exactly all the URLs I would like indexed!
Several of the URLs – page URLs lacking a trailing slash, which I don’t use at all on my site – have “crawl problems” because they exhibit a “redirect”. Yes: Github Pages is (correctly) issuing 301 (permanent) redirects for these URLs. But Google refuses to follow them for a random fraction of my site?!?
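For reference, here is roughly what "a sitemap that only lists trailing-slash URLs" amounts to, as a Python sketch. This is not the site generator's actual code (that is written in Lua); the function names and example.com URLs are placeholders, and the normalization assumes every page URL is directory-style, with no query strings or file names.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def with_trailing_slash(url: str) -> str:
    """Return the canonical, trailing-slash form of a page URL."""
    return url if url.endswith("/") else url + "/"


def build_sitemap(urls) -> str:
    """Build a minimal sitemap listing each canonical URL exactly once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted({with_trailing_slash(u) for u in urls}):
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Slash and no-slash variants of the same page collapse into one entry.
pages = [
    "https://example.com/about",
    "https://example.com/about/",
    "https://example.com/",
]
print(build_sitemap(pages))
```

The no-slash variants never appear in the sitemap, which is why it is puzzling to see them show up as crawl problems.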
Oh, and no surprise: the Search Console (BETA!) is the usual Google user-interface dumpster fire.
“Hey Google! Code much?”
I was thinking, before this announcement, that I might switch to relying on Yahoo/Bing instead for my “webmaster tools” experience.
Given what I’ve just seen – and even notwithstanding my terrible previous experience with Bing Webmaster Tools – maybe that isn’t a bad idea.
I should also mention that I’ve made some aesthetic changes to the site, bringing it closer in style to the muforth site than previously. In fact, except for the choice of heading font, they are almost indistinguishable. (This is, IMHO, a bug and not a feature. Each site needs its own color and design scheme.)
I hope that the change away from the beautiful but somewhat hard-to-read Alegreya typefaces – I was using the serif and sans – is an improvement. It makes me sad to admit failure, but perhaps Alegreya is better suited to print than to digital screens.
(Hmm – just noticed that there is an Encode Sans semi-condensed. Maybe I should try that too...)
I hope these changes improve the readability of the site.
And – and this is just weird – I noticed, as I was searching for a URL to link to – that all the Droid fonts have vanished from Google Fonts and are now only available from Monotype! And yet, I’m still using Droid Serif, here and on the muforth site... Are the fonts suddenly going to stop working? Do I need to start searching for a new body font?!? Argh!
I decided to publish an identicon demo that I put together a while ago. It’s totally out of context (I’ve been meaning to publish a page explaining the history and current usage of identicons, but haven’t yet) but you might find it intriguing, or perhaps even beautiful.
Also, sharing it might shame me into publishing my other thoughts on the subject!
Happy New Year!
I thought I’d ring in 2018 by deleting a third of my web site.
Google can’t seem to be bothered to index all of it (they only tell me that they have indexed 110-ish out of 140-ish pages, but won’t tell me which ones) and there was a lot of old mossy junk in there that I don’t care about anymore, and that no one else probably ever cared about.
I doubt I’m done purging, but I thought I would make a first pass at an early spring cleaning.
I also hope to write more in 2018! It’s been a bit ridiculous. I have several long rants I want to write that I can never seem to get around to. I’d love to change that this year. A weekly rant? Wouldn’t that be nice?
Maybe none of this matters, anyway. But it’s something to do. ;-)
Read the 2017 journal.