
  • Tiny Tiny RSS

    Tiny Tiny RSS is a simple syndication application. I’ve been using RSS readers for a long time; I just don’t have time to scour the web checking whether a website has added new content. I started with Google Reader a long time ago (as many of us did) and was sad when it went away. So I moved to Newsblur, which suited me and kept me reading my feeds for another year or so. It also has (or had - I haven’t checked recently) an open-source model, but at the price it was offering I didn’t need to host it myself, as it was doing a fine job for what I paid. Then the price went up, and for some reason I felt it was too much. I was probably hasty, but I was also ready for a change.


    I switched my feeds to The Old Reader. This was an obvious attempt to recreate the Google Reader experience.

    Recently, I installed Tiny Tiny RSS on a Raspberry Pi. There was a package already, but it attempted to install Apache - I was already running Nginx and didn’t want the extra dependency - so I downloaded the source code, put it in a directory served by the webserver, and everything worked instantly. I had already created a database and a user/password combination in MariaDB, so the setup was simple. At this point, I imported an ~.opml~ file that I already had. I then read about running the update script: I started a tmux session and issued the command ~/usr/bin/php /path/to/tt-rss/update.php --feeds --quiet~, and this keeps the feeds updated automatically.
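    If you would rather not keep a tmux session around, the same command can be run from cron instead (a sketch; the five-minute interval is my choice, and /path/to/tt-rss stands in for wherever you unpacked the source):

    ```
    # crontab entry: update tt-rss feeds every five minutes
    */5 * * * * /usr/bin/php /path/to/tt-rss/update.php --feeds --quiet
    ```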

    You can add feeds using the following dialogue:

    [Screenshot: ttrss3 - the add-feed dialogue]

    The output is really good to look at:

    [Screenshot: ttrss1 - the article list view]

    There is also an app for your phone to hook up to your server, so you can read your articles on the go.

  • Verify Azure Blob Storage Automatically

    This blog post outlines a problem I had: lots of files in Azure storage that I wanted to check had been uploaded correctly.

    Problem

    You want to verify lots of files that you have uploaded to Azure Blob Storage? Look no further: https://github.com/kabads/md5sum.

    So, I had to upload a lot of media assets for work in a hurry, as a server was being shut down. We had Azure, and as these were static files, that seemed like a good solution. In total, there were roughly 200,000 files. I wanted to md5sum them at each stage of the process. I did that for the huge 5GB zip file I was given and asked the colleague who provided it to do the same; the sums matched. I could do this on all my machines, but not once the zip file was unzipped.

    Solution

    So, I wrote a script[0]. Doing the local verification was fairly easy.
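    The local side can be sketched like this (a minimal sketch, not the script itself; reading in chunks keeps memory use flat even for multi-gigabyte files):

    ```python
    import hashlib

    def local_md5(path, chunk_size=1 << 20):
        """Return the md5 hex digest of a file, reading it in 1 MiB chunks."""
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()
    ```

    This produces the same hex string as the md5sum command, so the local sums can later be compared against whatever comes back from Azure.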

    Doing the Azure side was not so easy. Azure stores the md5sum in an unusual way, and lots of people have written about this. Most of my research returned this kind of post. No one seemed to have my huge-number-of-files problem; they just had the problem that the md5sum format wasn’t the same. I tried reverse-engineering the problem, but found it tricky. Then I hit upon gold-dust:

    import binascii
    ...
    # b is the raw 16-byte digest from the blob's content_md5 property
    remote_md5 = binascii.hexlify(b)
    

    This turned the md5sum stored in Azure into something that could be compared (and was the same as the output of the typical md5sum command that you would run locally).
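    To see why this works, here is a self-contained illustration (hashlib stands in for Azure here; the point is that the stored value is the raw 16-byte digest, which hexlify turns into the familiar hex string):

    ```python
    import binascii
    import hashlib

    data = b"hello world"

    # Azure stores content_md5 as the raw 16-byte digest, not a hex string.
    raw_digest = bytearray(hashlib.md5(data).digest())

    # hexlify converts the raw bytes into the usual 32-character hex form.
    remote_md5 = binascii.hexlify(raw_digest).decode("ascii")
    local_md5 = hashlib.md5(data).hexdigest()

    assert remote_md5 == local_md5
    ```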

    How to handle Azure blobs individually with the Python Azure SDK

    Get a BlobServiceClient:

    blob_service_client = BlobServiceClient.from_connection_string(connection_str)
    

    The connection string is usually picked up from an environment variable that you have set up locally (and is provided nicely in the Azure console).

    Then, get a container client:

    container = blob_service_client.get_container_client(container=container_name)
    

    Then, with the container, you can list all the blobs:

    blob_list = container.list_blobs()
    

    Once you have a list of blobs, you can iterate through them and then get a blob client:

    for blob in blob_list:
        blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob)
        ...
    

    and then get the blob properties and the md5 property:

    a = blob_client.get_blob_properties()
    b = a.content_settings.content_md5
    

    Once I had that, I could apply the binascii.hexlify() trick and write everything out to a file.

    This file only really solves my problem, but please feel free to run with it and make adaptations. I’m interested in any pull requests that improve it. It does need less ‘hard-coding’.

    [0] https://github.com/kabads/md5sum

  • Apache Rewrite Mod

    mod_rewrite is a powerful module that Apache can utilize. It is a way of rewriting URLs, modifying the request that Apache receives. This could be a moved document, or enforcing SSL (rewriting the URL from http to https).

    This is a complex subject, which cannot be covered fully here; for further information, refer to the full documentation. Rewrite rules can exist in a .htaccess file, in the main configuration file, or preferably in a <Directory> stanza. mod_rewrite uses regular expressions (compatible with Perl) for its pattern-matching engine, which allows almost unlimited flexibility in matching URLs.

    Enable Rewrite Engine

    To enable mod_rewrite you should include the following in your Apache configuration:

    RewriteEngine on

    A restart of Apache will be required to load the engine.

    Declaring a Rewrite Rule

    To declare a rule you will have something similar to the code below:

    RewriteRule ^/old.html$ new.html [R]

    This will redirect a request that Apache receives for the old.html page to the new.html page. The ^ indicates the start of the request and the $ indicates the end. If /subdir/old.html is requested, it will fail this pattern, because the request begins with /subdir/ rather than /old.html. This is in line with regular-expression pattern matching.
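    As an example of the SSL case mentioned earlier, a common pattern for forcing https combines a rule with a RewriteCond test (a sketch; place it in the server or virtual-host configuration):

    ```
    RewriteEngine on
    # If the request did not arrive over SSL...
    RewriteCond %{HTTPS} off
    # ...redirect permanently to the same host and path over https
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    ```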

    These rules can be embedded within a particular directory, using the <Directory> stanza:

    <Directory /var/www/html/subdirectory>
      RewriteEngine on
      RewriteRule "^old.html$" "new.html"
    </Directory>
    

    The above rule only applies to the directory named subdirectory.

    Rewrite Flags

    At the end of each RewriteRule is a set of flags that determines what should be done - these are enclosed in a set of square brackets. One of the most common is [R] which is a redirect, carried out at the browser level (issued by the webserver).
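    Flags can be combined with commas. For example (a sketch using the R and L flags together):

    ```
    # Redirect with an explicit 301 (permanent) status and stop
    # processing any further rewrite rules for this request
    RewriteRule "^/old\.html$" "/new.html" [R=301,L]
    ```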

    A full list of flags is documented at https://httpd.apache.org/docs/2.4/rewrite/flags.html.

    Regular Expressions and mod_rewrite

    Character  Meaning                                            Example
    .          Match any character                                c.t matches cat
    +          Repeats the previous match one or more times       a+ matches a, aa, aaa
    *          Repeats the previous match zero or more times      a* matches the same as a+, but also matches an empty string
    ?          Makes the match optional                           colou?r matches color and colour
    \          Escapes the next character                         \. matches a literal . (dot), not any single character
    ^          An anchor - matches the beginning of the string    ^a matches a string that begins with a
    $          The other anchor - matches the end of the string   a$ matches a string that ends with a
    ( )        Groups characters into a single unit and captures  (ab)+ matches abababab - the + applies to the group
               a match for use in a backreference
    [ ]        A character class - matches one of the characters  c[uoa]t matches cut, cot and cat
    [^ ]       A negated character class - matches any character  c[^/]t matches cat or c=t but not c/t
               not listed

    mod_rewrite Log Level for < v2.4

    mod_rewrite writes to the usual Apache log files.

    For previous versions (pre Apache 2.4) the directive RewriteLogLevel sets the level of logging, ranging in value from 0-9, with 0 being no logging and 9 being the most verbose. Logs will appear as pass through lines in the file.

    mod_rewrite Log Level for v2.4

    With the current version of Apache (2.4), the older methods of controlling mod_rewrite logging have been replaced by a per-module logging method:

    LogLevel alert rewrite:trace3
    
  • Bike-packing preparation

    Due to covid-19, we are all currently in lock-down. I was planning to go bike-packing this spring, but cannot get out. But that didn’t stop me from having some fun in the garden with a tarpaulin: [Photo: IMG_20200329_165257]

  • Rsyslog configured to monitor journald /run/systemd/journal/syslog socket

    Rsyslog (aka syslog) can pull in logs from journald with very little config, either via a socket (using the imuxsock module) or via a dedicated module that taps into a socket that journald runs (imjournal). imjournal specialises in structured logging (e.g. logs that follow a JSON structure) and in filtering or querying logs; as a result, imjournal is more expensive. Both modules are loaded by default in the rsyslog.conf file and do not need adding in any other configuration files.

    To use the socket (imuxsock) module instead of the imjournal module, turn off persistent logging in journald by removing the directory /var/log/journal and setting Storage=auto in /etc/systemd/journald.conf. Once you restart journald, it will not write logs to disk, but to a virtual file location in /run/systemd/journal. This runs the risk of losing logs if they are not persisted, so we need rsyslog to monitor this socket (thereby writing the logs to disk and to Papertrail at the same time).

    Of these two methods, the more lightweight one is the socket /run/systemd/journal/syslog, which is less expensive than imjournal.

    To create this socket, you need to look at the systemd unit file for rsyslog.

    Two lines are usually commented out (with the character ‘;’).

    [Unit]
    ...
    ;Requires=syslog.socket
    ...
    [Install]
    WantedBy=multi-user.target
    ;Alias=syslog.service
    

    Remove the ; characters and restart rsyslog with systemctl restart rsyslog; the /run/systemd/journal/syslog socket should then be there.

    If you are coming from a default CentOS install, rsysolg will keep a position of where it is in the journal log. This can cause problems whilst configuring the socket. By deleting the state file, /var/log/rsyslog/imjournal.state, you will prevent these problems. Once you restart rsyslog, this file will be recreated with the new journal position (which is something we don’t need to worry about, but rsyslog may throw errors if you don’t force this ‘refresh’).

    Once you have a fresh state and the ; characters have been removed from the rsyslog unit file /lib/systemd/system/rsyslog.service, the virtual file /run/systemd/journal/syslog will be in place and rsyslog is ready to pull logs from journald.


    Then a directive needs to be set for rsyslog to look at that socket, so this line should be included in rsyslog config (preferably /etc/rsyslog.d/##-journald.conf):

    # /etc/rsyslog.d/48-journal.conf
    $SystemLogSocketName /run/systemd/journal/syslog
    

    There is also a symlinked unit file held by systemd that lets rsyslog manage the /run/systemd/journal/syslog socket. This file can be symlinked with ln -s /lib/systemd/system/rsyslog.service /etc/systemd/system/syslog.service.

    # /etc/systemd/system/syslog.service
    
    [Unit]
    Description=System Logging Service
    Requires=syslog.socket
    Wants=network.target network-online.target
    After=network.target network-online.target
    Documentation=man:rsyslogd(8)
    Documentation=http://www.rsyslog.com/doc/
    
    [Service]
    Type=notify
    EnvironmentFile=-/etc/sysconfig/rsyslog
    ExecStart=/usr/sbin/rsyslogd -n $SYSLOGD_OPTIONS
    Restart=on-failure
    UMask=0066
    StandardOutput=null
    
    [Install]
    WantedBy=multi-user.target
    Alias=syslog.service
    

    The /etc/rsyslog.conf file should look like this (the change here is the $OmitLocalLogging off directive, which is usually on, but must be off for the rsyslog socket to work):

    # /etc/rsyslog.conf
    $PreserveFQDN off
    
    #################
    #### MODULES ####
    #################
    $ModLoad imuxsock # provides support for local system logging - via logger command
    $ModLoad imklog # provides kernel logging support
    $ModLoad immark # provides --MARK-- message capability
    $ModLoad imjournal
    
    
    
    ###########################
    #### GLOBAL DIRECTIVES ####
    ###########################
    
    
    #
    # Use traditional timestamp format.
    # To enable high precision timestamps, comment out the following line.
    #
    $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
    
    # Filter duplicated messages
    $RepeatedMsgReduction on
    
    #
    # Set temporary directory to buffer syslog queue
    #
    $WorkDirectory /var/lib/rsyslog
    
    #
    # Set the default permissions for all log files.
    #
    $FileOwner root
    $FileGroup root
    $FileCreateMode 0600
    $DirCreateMode 0755
    $Umask 0022
    
    #
    # Set other directives
    #
    $OmitLocalLogging off
    $IMJournalStateFile imjournal.state
    

    The journald.conf file needs one change (without this change, most logs will be logged twice - once by rsyslog itself and then also by journald sending it to rsyslog):

    # /etc/systemd/journald.conf
    ...
    ForwardToSyslog=no
    ...
    

    After restarting both journald (systemctl restart systemd-journald) and rsyslog (systemctl restart rsyslog) you will be able to test both logs with journalctl -f and tail -f /var/log/messages - then issue the command logger -p daemon.warn "This is only a test.".

    This should be written to both logs at the same time.

subscribe via RSS