Tiny Tiny RSS
Tiny Tiny RSS is a simple syndication application. I’ve been using RSS syndication readers for a long time, because I just don’t have time to scour the web to check whether a website has added new content. I started with Google Reader a long time ago (as did many of us), and was sad when it went away. So I moved to Newsblur. This suited me and helped me read my blog posts for another year or so. It also has (or had - I haven’t checked recently) an open source model. However, at the price it was offering, I didn’t need to host it myself, as it was doing a fine job for what I paid. Then the price went up. For some reason, I felt that it was too much. I was probably hasty, but also ready for a change.
I switched my feeds to The Old Reader. This was an obvious attempt to recreate the Google Reader experience.
Recently, I installed Tiny Tiny RSS on a Raspberry Pi. There was a package already, but it attempted to install Apache. I was already running Nginx and didn’t want this extra dependency, so I downloaded the source code, put it in a directory that was served via the webserver and everything worked instantly. I had already created a database and user/pass combination in MariaDB, so the setup was simple. At this point, I imported an ~.opml~ file that I already had. I then read about running the update script. I started a tmux terminal and issued the command ~/usr/bin/php /path/to/tt-rss/update.php --feeds --quiet~ and this runs automatically.
You can add feeds using the following dialogue:
The output is really good to look at:
There is also an app for your phone to hook up to your server, so you can read your articles on the go.
Verify Azure Blob Storage Automatically
This blog post outlines a problem I had whereby there were lots of files in Azure storage and I wanted to check that they had been uploaded correctly.
Problem
You want to verify lots of files that have been uploaded to Azure Blob Storage? Look no further. https://github.com/kabads/md5sum.
So, I had to upload a lot of media assets for work in a hurry as a server was being shut down. We had Azure, and as these were static files, that seemed like a good solution. I think in total, there were roughly 200,000 files. I wanted to md5sum them at each stage. I did that for the huge 5 GB zip file I was given and asked my colleague who provided it to me to do the same. It was good. I could do this on all my machines, but not once the zip file was unzipped.
Solution
So, I wrote a script[0]. Doing the local verification was fairly easy.
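The local side can be sketched roughly as follows. This is a minimal illustration and not the code from the linked repository; the function names `md5_of_file` and `md5_of_tree` are made up for this example.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never held in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_tree(root):
    """Map each file path (relative to root) to its hex MD5 digest."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            sums[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return sums
```

Keying the result by relative path means the same dictionary can later be compared against blob names, provided the blobs were uploaded with matching relative names.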
Doing the Azure solution was not so easy. Azure stores the md5sum in a weird way and lots of people have written about this. Most of my research returned this kind of post. No one seemed to have my huge-number-of-files problem; they just had the problem whereby the md5sum format wasn’t the same. I tried reverse engineering the problem, but found it tricky. Then I hit upon gold-dust:
import binascii
...
remote_md5 = binascii.hexlify(b)
This turned the md5sum stored in Azure into something that could be compared (and was the same as the output of the typical md5sum command that you would run locally).
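Why this works can be shown without touching Azure at all. The sketch below assumes the stored value is the raw 16-byte MD5 digest (which HTTP tooling usually displays base64-encoded); the sample data is invented for illustration.

```python
import base64
import binascii
import hashlib

data = b"example blob contents"  # stand-in for a file's bytes

# What md5sum prints locally: a 32-character hex string.
local_md5 = hashlib.md5(data).hexdigest()

# Assumption: the remote side holds the same digest as raw bytes.
raw_digest = hashlib.md5(data).digest()

# The same raw bytes shown base64-encoded, which looks nothing like md5sum output.
as_base64 = base64.b64encode(raw_digest).decode("ascii")

# hexlify returns bytes, so decode it before comparing with a str.
remote_md5 = binascii.hexlify(raw_digest).decode("ascii")

print(as_base64)
print(remote_md5 == local_md5)
```

Note that `binascii.hexlify()` returns a bytes object, so it needs decoding before it is compared with text read from md5sum output.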
How to handle Azure blobs individually with the Python Azure SDK
Get a BlobServiceClient:
blob_service_client = BlobServiceClient.from_connection_string(connection_str)
The connection string is usually picked up from an environment variable that you have set up locally (and is provided nicely in the Azure console).
Then, get a container client:
container = blob_service_client.get_container_client(container=container_name)
Then, with the container, you can list all the blobs:
blob_list = container.list_blobs()
Once you have a list of blobs, you can iterate through them and then get a blob client:
for blob in blob_list:
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob)
    ...
and then get the blob properties and the MD5 property:
a = blob_client.get_blob_properties()
b = a.content_settings.content_md5
Once I had that I could use my binascii.hexlify() magic and everything would be ready to write out to a file.
This script only really solves my problem, but please feel free to run with it and make adaptations. I’m interested in any pull requests that improve it. It does need less ‘hard-coding’.
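Put together, the steps above might look something like the sketch below. It is not the script from the repository: `remote_md5_lines` is a hypothetical helper, and it takes any object that behaves like the container client used earlier (with `list_blobs()` and `get_blob_client()`), which also makes it easy to try out with a stand-in object. Skipping blobs that have no stored hash is my assumption about sensible behaviour.

```python
import binascii


def remote_md5_lines(container):
    """Yield 'hexdigest  blobname' lines for every blob in the container.

    `container` is assumed to behave like the container client above.
    """
    for blob in container.list_blobs():
        blob_client = container.get_blob_client(blob.name)
        properties = blob_client.get_blob_properties()
        content_md5 = properties.content_settings.content_md5
        if content_md5 is None:
            # Assumption: flag blobs without a stored hash instead of failing.
            yield f"{'MISSING':<32}  {blob.name}"
            continue
        hex_digest = binascii.hexlify(content_md5).decode("ascii")
        # Two spaces between hash and name, the same layout md5sum prints.
        yield f"{hex_digest}  {blob.name}"
```

Writing the result out is then just `open("remote.md5", "w").write("\n".join(remote_md5_lines(container)))`, where the output filename is again only an example.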
[0] https://github.com/kabads/md5sum
Apache Rewrite Mod
mod_rewrite is a powerful module that Apache can utilize. It is a way of rewriting URLs, modifying the request that Apache receives. This could be for a moved document, or for enforcing SSL (rewriting the URL from http to https).
This is a complex subject, which cannot be covered fully here; for further information, refer to the full documentation. Rewrite rules can exist in a .htaccess file, or the main configuration file, or preferably a <Directory> stanza.
mod_rewrite uses Regex (compatible with Perl) for its pattern matching engine. This is nearly unlimited in searching across a range of URLs.
Enable Rewrite Engine
To enable mod_rewrite you should include the following in your Apache configuration:
RewriteEngine on
A restart of Apache will be required to load the engine.
Declaring a Rewrite Rule
To declare a rule you will have something similar to the code below:
RewriteRule ^/old.html$ new.html [R]
This will redirect a request that Apache receives for the old.html page to the new.html page. The ^ indicates that the match must start at the beginning of the request, and the $ indicates that it is the end of the request. If /subdir/old.html is passed, it will fail this search pattern as /subdir/ is at the beginning of the request. This is in line with regular expression pattern matching.
These rules can be embedded within a particular directory, using the <Directory> stanza:
<Directory /var/www/html/subdirectory>
    RewriteEngine on
    RewriteRule "^old.html$" "new.html"
</Directory>
The above rule only applies to the directory named subdirectory.
Rewrite Flags
At the end of each RewriteRule is a set of flags that determines what should be done - these are enclosed in a set of square brackets. One of the most common is [R], which is a redirect, carried out at the browser level (issued by the webserver).
A full list of flags is documented at https://httpd.apache.org/docs/2.4/rewrite/flags.html.
Regular Expressions and mod_rewrite
Character | Meaning | Example
. | Match any character | c.t matches cat
+ | Repeats the previous match one or more times | a+ matches a, aa, aaa
* | Repeats the previous match zero or more times | a* matches the same as a+ but will also match an empty string
? | Makes the match optional | colou?r will match color and colour
\ | Escape the next character | \. will match a . (dot) and not any single character as explained above
^ | Called an anchor, matches the beginning of the string | ^a will match a string that begins with a
$ | The other anchor that matches the end of a string | a$ will match a string that ends with a
( ) | Groups several characters into a single unit, and captures a match for use in a backreference | (ab)+ matches abababab - the + applies to the group
[ ] | A character class - matches one of the characters | c[uoa]t matches cut, cot, cat
[^ ] | Negative character class - matches any character not specified | c[^/]t matches cat or c=t but not c/t
mod_rewrite Log Level for < v2.4
mod_rewrite will write to the usual Apache log files.
For previous versions (pre Apache 2.4) the directive RewriteLogLevel will set the level of logging written, ranging in values from 0-9 with 0 being no logging and 9 being the most verbose. Logs will appear as pass through lines in the file.
mod_rewrite Log Level for v2.4
With the current version of Apache (2.4), the older methods of controlling mod_rewrite logging have been replaced by a new per-module logging method:
LogLevel alert rewrite:trace3
Bike-packing preparation
Due to COVID-19, we are all currently in lock-down. I was planning to go bike-packing this spring, but cannot get out. But that didn’t stop me from having some fun in the garden with a tarpaulin:
Rsyslog configured to monitor journald /run/systemd/journal/syslog socket
Rsyslog (aka syslog) can pull in logs from journald with very little config, either via a socket (using the imuxsock module) or via a specific module that taps in to the journal that journald maintains (imjournal). imjournal specialises in logging that is structured (e.g. logs that follow a JSON structure) and in filtering or querying logs. As a result, imjournal is more expensive. These modules are already loaded by default in the rsyslog.conf file and do not need adding in any other configuration files.
To use the socket (imuxsock) module instead of the imjournal module, turn off persistent logging in journald by removing the directory /var/log/journal and setting Storage=auto in /etc/systemd/journald.conf. Once you restart journald, it will not write logs to disk, but instead to volatile storage under /run. This runs the risk that we may lose some logs if they are not persisted. Therefore, we need to get rsyslog to monitor the journald socket (thereby writing the logs to disk and to Papertrail at the same time).
Rsyslog has two ways of pulling in journald logs. One is through a module called imjournal, which is good at structuring log files. However, another more lightweight method is through creating a socket, /run/systemd/journal/syslog. This socket is less expensive.
To create this socket, you need to look at the systemd unit file for rsyslog.
Two lines are usually commented out (with the character ‘;’).
[Unit]
...
;Requires=syslog.socket
...
[Install]
WantedBy=multi-user.target
;Alias=syslog.service
Remove the ; characters and restart syslog with systemctl restart rsyslog and the /run/systemd/journal/syslog file should be there.
If you are coming from a default CentOS install, rsyslog will keep a position of where it is in the journal log. This can cause problems whilst configuring the socket. By deleting the state file, /var/lib/rsyslog/imjournal.state, you will prevent these problems. Once you restart rsyslog, this file will be recreated with the new journal position (which is something we don’t need to worry about, but rsyslog may throw errors if you don’t force this ‘refresh’).
Once you have a fresh state, you need the virtual file /run/systemd/journal/syslog - once this is in place, rsyslog is ready to pull logs from journald. To create this socket/file, edit the systemd unit file for rsyslog, /lib/systemd/system/rsyslog.service, as described above.
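Whether the socket has actually appeared can be checked by hand with ls, or with a small script. The following is only a sketch; `is_unix_socket` is a hypothetical helper name, and the path is the one used throughout this post.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if path exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to inspect it.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    print(is_unix_socket("/run/systemd/journal/syslog"))
```

A regular file at that path would return False, which is useful because a plain file left there by mistake will not receive any log traffic.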
Then a directive needs to be set for rsyslog to look at that socket, so this line should be included in rsyslog config (preferably /etc/rsyslog.d/##-journald.conf):
# /etc/rsyslog.d/48-journal.conf
$SystemLogSocketName /run/systemd/journal/syslog
There is also a symlinked file held by systemd that lets rsyslog manage the /run/systemd/journal/syslog socket. This file can be symlinked with ln -s /lib/systemd/system/rsyslog.service /etc/systemd/system/syslog.service.
# /etc/systemd/system/syslog.service
[Unit]
Description=System Logging Service
Requires=syslog.socket
Wants=network.target network-online.target
After=network.target network-online.target
Documentation=man:rsyslogd(8)
Documentation=http://www.rsyslog.com/doc/
[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/rsyslog
ExecStart=/usr/sbin/rsyslogd -n $SYSLOGD_OPTIONS
Restart=on-failure
UMask=0066
StandardOutput=null
[Install]
WantedBy=multi-user.target
Alias=syslog.service
The /etc/rsyslog.conf file should look like this (the change here is the $OmitLocalLogging off directive, which is usually on, but must be off for the rsyslog socket to work):
# /etc/rsyslog.conf
$PreserveFQDN off

#################
#### MODULES ####
#################

$ModLoad imuxsock # provides support for local system logging - via logger command
$ModLoad imklog # provides kernel logging support
$ModLoad immark # provides --MARK-- message capability
$ModLoad imjournal

###########################
#### GLOBAL DIRECTIVES ####
###########################

#
# Use traditional timestamp format.
# To enable high precision timestamps, comment out the following line.
#
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

# Filter duplicated messages
$RepeatedMsgReduction on

#
# Set temporary directory to buffer syslog queue
#
$WorkDirectory /var/lib/rsyslog

#
# Set the default permissions for all log files.
#
$FileOwner root
$FileGroup root
$FileCreateMode 0600
$DirCreateMode 0755
$Umask 0022

#
# Set other directives
#
$OmitLocalLogging off
$IMJournalStateFile imjournal.state
The journald.conf file needs one change (without this change, most logs will be logged twice - once by rsyslog itself and then also by journald sending them to rsyslog):
# /etc/systemd/journald.conf
...
ForwardToSyslog=no
...
After restarting both journald (systemctl restart systemd-journald) and rsyslog (systemctl restart rsyslog) you will be able to test both logs with journalctl -f and tail -f /var/log/messages - then issue the command logger -p daemon.warn "This is only a test.".
This should be written to both logs at the same time.
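If you would rather send the test message from a script than from logger, Python's standard syslog module can do roughly the same thing. This is a sketch only: the ident "logtest" and the function name are invented for this example, and the message is assumed to reach the same local log path that logger uses.

```python
import syslog


def send_test_message(message="This is only a test."):
    """Send one message with facility daemon and severity warning."""
    # Mirrors the facility.severity pair of `logger -p daemon.warn`.
    syslog.openlog(ident="logtest", facility=syslog.LOG_DAEMON)
    syslog.syslog(syslog.LOG_WARNING, message)
    syslog.closelog()


if __name__ == "__main__":
    send_test_message()
```

With journalctl -f and tail -f /var/log/messages still running, the message should show up tagged with the ident, which makes it easy to spot among other daemon messages.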