Notes to self, 2016
2016-12-21 - convert / dehydrated / certbot / letsencrypt config
If, like me, you find yourself having to reuse your Letsencrypt credentials/account generated by Dehydrated (a bash Letsencrypt client) with the official Certbot client, you'll want to convert your config files.
In my case, I wanted to change my e-mail address, and the Dehydrated client offered no such command. With Certbot you can do this:
$ certbot register --update-registration --account f65c...
But you'll need your credentials in a format that Certbot groks.
With a bit of trial and error, you can come a long way converting the files:
$ ls /.../dehydrated/accounts/ACCOUNT
account_key.pem  registration_info.json
$ mkdir -p /etc/letsencrypt/accounts/acme-v01.api.letsencrypt.org/directory/ACCOUNT
$ cd /etc/letsencrypt/accounts/acme-v01.api.letsencrypt.org/directory/ACCOUNT
$ cat >meta.json <<EOF
{"creation_host": "my.example.com", "creation_dt": "2016-12-20T14:12:31Z"}
EOF
If you have a sample Certbot regr.json, you'll figure out what to place there based on the contents of the Dehydrated registration_info.json.

registration_info.json:
{ "id": ACCOUNT_NUMBER, "key": { "kty": "RSA", "n": "MODULUS_IN_BASE64", "e": "EXPONENT_IN_BASE64" }, "contact": [], "agreement": "https://letsencrypt.org/documents/LE-SA-v1.1.1-August-1-2016.pdf", "initialIp": "IP_ADDRESS", "createdAt": "2016-12-20T14:12:31.054249908Z", "Status": "valid" }
regr.json:
{ "body": { "contact": [], "agreement": "https://letsencrypt.org/documents/LE-SA-v1.1.1-August-1-2016.pdf", "key": { "kty": "RSA", "n": "MODULUS_IN_BASE64", "e": "EXPONENT_IN_BASE64" } }, "uri": "https://acme-v01.api.letsencrypt.org/acme/reg/ACCOUNT_NUMBER", "new_authzr_uri": "https://acme-v01.api.letsencrypt.org/acme/new-authz", "terms_of_service": "https://letsencrypt.org/documents/LE-SA-v1.1.1-August-1-2016.pdf" }
Lastly, you'll need the Certbot private_key.json. It can be converted from the Dehydrated account_key.pem, with the following rsapem2json.py python snippet:
#!/usr/bin/env python
# Usage: openssl rsa -in account_key.pem -text -noout | python rsapem2json.py
# Will convert the RSA PEM private key to the Letsencrypt/Certbot
# private_key.json file.
#
# Public Domain, Walter Doekes, OSSO B.V., 2016
#
# From:
#   -----BEGIN RSA PRIVATE KEY-----
#   MIIJJwsdAyjCseEAtNsljpkjhk9143w//jVdsfWsdf9sffLgdsf+sefdfsgE54km
#   ...
#
# To:
#   {"e": "AQAB",
#    "n": "2YIitsUxJlYn_rVn_8Sges...",
#    ...
#
from base64 import b64encode
from sys import stdin

maps = {
    'modulus': 'n', 'privateExponent': 'd', 'prime1': 'p', 'prime2': 'q',
    'coefficient': 'qi', 'exponent1': 'dp', 'exponent2': 'dq'}
extra = {'kty': 'RSA', 'e': '<publicExponent>'}


def block2b64(lines, key):
    found = False
    chars = []
    for line in lines:
        if line.startswith(key + ':'):
            found = True
        elif found and line.startswith(' '):
            for i in line.split(':'):
                i = i.strip()
                if i:
                    chars.append(chr(int(i, 16)))
        elif found:
            break
    assert chars, 'nothing found for {0}'.format(key)
    return b64encode(''.join(chars))


data = stdin.read().split('\n')
conv = dict((v, block2b64(data, k)) for k, v in maps.items())
conv.update(extra)

# Add exponent
e = [i for i in data if i.startswith('publicExponent:')][0]
e = e.split('(', 1)[-1].split(')', 1)[0]
assert e.startswith('0x'), e
e = ('', '0')[len(e) % 2 == 1] + e[2:]
e = b64encode(''.join(chr(int(e[i:i+2], 16)) for i in range(0, len(e), 2)))
conv['e'] = e

# JSON-safe output.
print(repr(conv).replace("'", '"'))
Don't forget to chmod 400 the private_key.json.
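Putting it together, running the conversion into place looks something like this (same paths as above):

$ cd /etc/letsencrypt/accounts/acme-v01.api.letsencrypt.org/directory/ACCOUNT
$ openssl rsa -in /.../dehydrated/accounts/ACCOUNT/account_key.pem -text -noout |
    python rsapem2json.py > private_key.json
$ chmod 400 private_key.json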
2016-12-16 - mysql / deterministic / reads sql data
Can I use the MySQL function characteristic DETERMINISTIC in combination with READS SQL DATA, and do I want to?
TL;DR
If the following two groups of statements are the same to you, you want the DETERMINISTIC characteristic on your FUNCTION, even if you have READS SQL DATA.
SET @id = (SELECT my_func());
SELECT * FROM my_large_table WHERE id = @id;
-- versus
SELECT * FROM my_large_table WHERE id = my_func();
(All of this is tested with MySQL 5.7.16 and some was also tested with MySQL cluster 5.6.)
First, some background
You may or may not have run into this MySQL error at one point:
You do not have the SUPER privilege and binary logging is enabled
(you *might* want to use the less safe log_bin_trust_function_creators variable)
You may specify certain characteristics to MySQL FUNCTIONs/PROCEDUREs when creating them:
CREATE
    [DEFINER = { user | CURRENT_USER }]
    PROCEDURE sp_name ([proc_parameter[,...]])
    [characteristic ...] routine_body

CREATE
    [DEFINER = { user | CURRENT_USER }]
    FUNCTION sp_name ([func_parameter[,...]])
    RETURNS type
    [characteristic ...] routine_body
...
characteristic:
    COMMENT 'string'
  | LANGUAGE SQL
  | [NOT] DETERMINISTIC
  | { CONTAINS SQL | NO SQL | READS SQL DATA | MODIFIES SQL DATA }
  | SQL SECURITY { DEFINER | INVOKER }

Source: http://dev.mysql.com/doc/refman/5.7/en/create-procedure.html
For instance, we can create a function that returns 4.
mysql> CREATE FUNCTION test_4() RETURNS INT DETERMINISTIC NO SQL RETURN 4;
mysql> select test_4()\G
*************************** 1. row ***************************
test_4(): 4
Not very useful, but it illustrates the most basic layout of a MySQL function.
We added the DETERMINISTIC
and NO SQL
characteristics
because the function always returns the same output for the same input (deterministic)
and it contains no SQL statements (no sql).
DETERMINISTIC characteristic, and how it affects replication
If you happen to create such a function as a non-privileged user
on a slavable MySQL server
— i.e. one that creates binlogs that a MySQL slave can use to replicate the statements
(using the log_bin
mysqld setting)
— you would run into the
“You do not have the SUPER privilege and binary logging is enabled”
error.
Why?
Because MySQL could use the characteristics from the functions to determine how a statement should be replicated. In this case, the characteristics might tell the replication: if this function is used in a statement, we can execute the same statement on the slave and keep consistent slave records.
To test this out, we create a few different functions and check how they would be replicated.
-- A real deterministic function.
CREATE FUNCTION f_deterministic() RETURNS INT
DETERMINISTIC
RETURN 4;

-- A non-deterministic function.
CREATE FUNCTION f_non_deterministic() RETURNS FLOAT
DETERMINISTIC
RETURN RAND();

-- A real deterministic function that modifies SQL data.
-- This one inserts a record into test_abc.
CREATE TABLE test_abc(id INT PRIMARY KEY AUTO_INCREMENT, value FLOAT NOT NULL);
DELIMITER ;;
CREATE FUNCTION f_modifies_sql_data() RETURNS FLOAT
DETERMINISTIC MODIFIES SQL DATA
BEGIN
    INSERT INTO test_abc (value) VALUES (5);
    RETURN 5;
END ;;
DELIMITER ;
If you run it, you get (something like) this:
mysql> SELECT f_deterministic(), f_non_deterministic(), f_modifies_sql_data();
+-------------------+-----------------------+-----------------------+
| f_deterministic() | f_non_deterministic() | f_modifies_sql_data() |
+-------------------+-----------------------+-----------------------+
|                 4 |    0.7306850552558899 |                     5 |
+-------------------+-----------------------+-----------------------+
If you use binlog-format = STATEMENT
you get this warning:
Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT.
The binlog contains this row; statement based replication as expected:
SELECT `testdb`.`f_modifies_sql_data`()
For ROW
and MIXED
binlog formats, the binlog looks like this:
$ mysqlbinlog /var/log/mysql/mysql-bin.000005 --base64-output=decode-rows --verbose ... ### INSERT INTO `testdb`.`test_abc` ### SET ### @1=2 ### @2=5
This was unexpected.
On this MySQL 5.7.16, for the MODIFIES SQL DATA
characteristic,
the DETERMINISTIC
property is ignored for the binlogs
and instead of replicating the query, it replicates the altered rows.
Note that that is not a bad thing. This keeps things consistent even if the function is mislabeled and should've been NOT DETERMINISTIC.
Lying in the characteristics
What if we lie to MySQL? And tell it that the function does not modify SQL data?
No problem for MIXED based replication. With DETERMINISTIC NO SQL
the INSERT statement is still propagated in the binlog.
It appears that MIXED/ROW based replication is unaffected by labeling/mislabeling of the function characteristics. That's one less thing to worry about. (For the tested MySQL version only! YMMV!)
Then, is there another reason to get the characteristics right?
Yes there is: query optimizations.
Query optimizations and DETERMINISTIC
Consider this table:
mysql> CREATE TABLE test_seq (id INT PRIMARY KEY NOT NULL, value FLOAT NOT NULL);
mysql> INSERT INTO test_seq VALUES (1,-1), (2, -2), (3, -3), (4, -4), (5, -5), (6, -6);
mysql> SELECT * FROM test_seq;
+----+-------+
| id | value |
+----+-------+
|  1 |    -1 |
|  2 |    -2 |
|  3 |    -3 |
|  4 |    -4 |
|  5 |    -5 |
|  6 |    -6 |
+----+-------+
What happens if we do this query:
SELECT * FROM test_seq WHERE id = f_modifies_sql_data();
Does it return 5? Yes it does. But it also inserts records into test_abc, because we told the function to do so.
mysql> DELETE FROM test_abc;
mysql> SELECT * FROM test_seq WHERE id = f_modifies_sql_data();
+----+-------+
| id | value |
+----+-------+
|  5 |    -5 |
+----+-------+
mysql> SELECT * FROM test_abc;
+----+-------+
| id | value |
+----+-------+
|  8 |     5 |
|  9 |     5 |
+----+-------+
Apparently that function was called twice. We would expect once, but a second time is okay. But what if we relabel it as NOT DETERMINISTIC?
mysql> DELIMITER ;;
mysql> CREATE FUNCTION f_modifies_sql_data_nondet() RETURNS FLOAT
       NOT DETERMINISTIC MODIFIES SQL DATA
       BEGIN
           INSERT INTO test_abc (value) VALUES (5);
           RETURN 5;
       END ;;
ERROR 1418 (HY000): This function has none of DETERMINISTIC, NO SQL, or READS SQL DATA
in its declaration and binary logging is enabled (you *might* want to use the less safe
log_bin_trust_function_creators variable)
Heh, a different error. Let's enable log_bin_trust_function_creators.
This time it writes.
mysql> DELETE FROM test_abc;
mysql> SELECT * FROM test_seq WHERE id = f_modifies_sql_data_nondet();
mysql> SELECT COUNT(*) FROM test_abc\G
*************************** 1. row ***************************
COUNT(*): 6
That's right! Six records.
One record inserted for every comparison against id.
Here you clearly see the difference between DETERMINISTIC and NOT DETERMINISTIC:
When used in a where_condition, a DETERMINISTIC
function is called once (maybe twice) while a NOT DETERMINISTIC
function is checked for every row.
Another clear example would be: SELECT COUNT(*) FROM test_seq WHERE RAND() > 0.5;
That could return any one value {0, 1, 2, 3, 4, 5, 6}.
But when we wrap RAND() in a DETERMINISTIC-labelled function, the result can only be in {0, 6}. Makes sense? Yes.
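For instance, reusing the (mislabeled) f_non_deterministic() from earlier — a quick sketch, results will vary per run:

-- Evaluated per row: any count from 0 through 6.
SELECT COUNT(*) FROM test_seq WHERE RAND() > 0.5;
-- Wrapped in the DETERMINISTIC-labelled f_non_deterministic(): evaluated
-- once for the statement, so the count is either 0 or 6.
SELECT COUNT(*) FROM test_seq WHERE f_non_deterministic() > 0.5;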
DETERMINISTIC and READS SQL DATA
On to the confusing bits: the internet does not agree on whether
DETERMINISTIC
and READS SQL DATA
can be combined.
But as you might realize at this point, this can be quite a useful combination:
First your custom FUNCTION does its magic and looks up an indexed value. Then you look up the record based on that indexed value, in a potentially huge table. You don't want the function to be called for every record.
mysql> EXPLAIN SELECT * FROM test_seq WHERE id = f_modifies_sql_data();
+----+----------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | table    | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+----------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | test_seq | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+----------+-------+---------------+---------+---------+-------+------+----------+-------+

mysql> EXPLAIN SELECT * FROM test_seq WHERE id = f_modifies_sql_data_nondet();
+----+----------+------+---------------+------+---------+------+------+----------+-------------+
| id | table    | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+----------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | test_seq | ALL  | NULL          | NULL | NULL    | NULL |    6 |    16.67 | Using where |
+----+----------+------+---------------+------+---------+------+------+----------+-------------+
The difference between instant lookup and an uber-slow where lookup over all your records.
Caching is not an issue
Lastly, one could imagine that MySQL used the DETERMINISTIC
characteristic to cache the result (ignoring the READS SQL DATA
).
This does not appear to be the case. (Again, tested with MySQL 5.7.16,
YMMV! Although I would expect them not to change semantics here lightly.)
If anything, your function is called too often. It's even called for every resultant row if used in the select_expression:
mysql> DELETE FROM test_abc;
mysql> SELECT id, f_modifies_sql_data() FROM test_seq;
...
mysql> SELECT COUNT(*) FROM test_abc\G
*************************** 1. row ***************************
COUNT(*): 6
Moral of the story: yes, your function can safely be labelled
DETERMINISTIC
even if it READS SQL DATA
.
The only things you should worry about are other non-deterministic functions
(RAND, UUID, ...), limits (LIMIT) and an out-of-sync database. But if your
database is out of sync, you have more pressing issues to worry about.
And MIXED/ROW based replication appears to handle all of that properly
anyway. As we've seen, it safely replicates MODIFIES SQL DATA
functions in all cases.
This matches with the statements made on Binary Logging of Stored Programs: In general, the issues described here result when binary logging occurs at the SQL statement level. If you use row-based binary logging, the log contains changes made to individual rows as a result of executing SQL statements.
That also means that the warnings/errors related to the
log_bin_trust_function_creators
can be safely ignored
when you use anything other than STATEMENT based binary logging.
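For reference, the relevant mysqld settings would look something like this (a sketch; file locations and exact values depend on your setup):

[mysqld]
log_bin       = /var/log/mysql/mysql-bin.log
# MIXED or ROW; STATEMENT is the format that produces the unsafe-statement warnings.
binlog_format = MIXED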
2016-12-01 - patch-a-day / pdns-recursor / broken edns lookups
Last month, our e-mail exchange (Postfix) started having trouble delivering mail to certain destinations. These destinations all appeared to be using Microsoft Office 365 for their e-mail. What was wrong? Who was to blame? And how to fix it?
The problem appeared like this:
Nov 16 17:04:08 mail postfix/smtp[13330]: warning: no MX host for umcg.nl has a valid address record
Nov 16 17:04:08 mail postfix/smtp[13330]: 1D1D21422C2: to=<-EMAIL-@umcg.nl>, relay=none, delay=2257, delays=2256/0.02/0.52/0, dsn=4.4.3, status=deferred (Host or domain name not found. Name service error for name=umcg-nl.mail.protection.outlook.com type=A: Host not found, try again)
If we looked up that domain normally, we'd get a result:
$ host umcg-nl.mail.protection.outlook.com
umcg-nl.mail.protection.outlook.com has address 213.199.154.23
umcg-nl.mail.protection.outlook.com has address 213.199.154.87
But if Postfix did a lookup, it failed with SERVFAIL. And interestingly, after the failed lookup from Postfix this failure response was cached in the DNS recursor.
It turned out that Postfix did a lookup with EDNS + DNSSEC
options because of the default smtp_tls_dane_insecure_mx_policy=dane
setting.
Extension mechanisms for DNS (EDNS) is used for DNSSEC. The security aware resolver — in this case Postfix — sets the EDNS0 OPT "DNSSEC OK" (DO) flag in the request to indicate that it wants to know whether the domain was secured by DNSSEC or not.
Note that the DNS recursor should do DNSSEC validation always if possible anyway — and discard the result if validation fails — but the DO-flag indicates that the caller wants to know whether it was secured at all. Postfix uses the outcome to determine whether the destination path could be forged by bad DNS entries and it updates its security position for that path accordingly.
This EDNS lookup is not new. It works on almost all DNS servers, except on older DNS implementations, like the ones Microsoft uses for Office 365.
$ dig A umcg-nl.mail.protection.outlook.com. \
    @ns1-proddns.glbdns.o365filtering.com. +edns +dnssec | grep FORMERR
;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 46904
;; WARNING: EDNS query returned status FORMERR - retry with '+nodnssec +noedns'
Because EDNS is an extension, DNS servers are not obligated to respond sensibly to that. A FORMERR response is okay. Your local DNS recursor should parse that response and retry.
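That retry is easy to check by hand, as dig's own warning already suggests — something like:

$ dig A umcg-nl.mail.protection.outlook.com. \
    @ns1-proddns.glbdns.o365filtering.com. +noedns +nodnssec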
And that is where the alpha2 version of PowerDNS recursor on Ubuntu/Xenial went afoul. It did not do a second lookup. Instead, it returned SERVFAIL, and cached that response.
It had been fixed already in 9d534f2, but that fix had not been applied to the Ubuntu LTS build yet.
Download the patch to the deb-package for Ubuntu/Xenial.
Patch instructions:
$ apt-get source pdns-recursor
$ patch -p0 < pdns-recursor_4.0.0~alpha2-2--osso1.patch
$ cd pdns-recursor-4.0.0~alpha2
$ DEB_BUILD_OPTIONS="parallel=15" dpkg-buildpackage -us -uc -sa
Relevant reading:
EDNS
DANE trouble with Microsoft mail-protection-outlook-com;
pdns-recursor 4.0.0~alpha2-2 fails on FORMERR response to EDNS query.
2016-11-30 - patch-a-day / dovecot / broken mime parts / xenial
At times, Dovecot started spewing messages into dovecot.log
about a corrupted index cache file because of “Broken MIME parts”.
This happened on Ubuntu/Xenial with dovecot_2.2.22-1ubuntu2.2:
imap: Error: Corrupted index cache file dovecot.index.cache: Broken MIME parts for mail UID 33928 in mailbox INBOX: Cached MIME parts don't match message during parsing: Cached header size mismatch (parts=4100...)
imap: Error: unlink(dovecot.index.cache) failed: No such file or directory (in mail-cache.c:28)
imap: Error: Corrupted index cache file dovecot.index.cache: Broken MIME parts for mail UID 33971 in mailbox INBOX: Cached MIME parts don't match message during parsing: Cached header size mismatch (parts=)
The problem appears to be fixed in 1bc6f1c.
Download the patch to the deb-package for Ubuntu/Xenial.
Patch instructions:
$ apt-get source dovecot
$ patch -p0 < dovecot_2.2.22-1ubuntu2.2--osso1.patch
$ cd dovecot-2.2.22
$ DEB_BUILD_OPTIONS="nocheck" dpkg-buildpackage -us -uc -sa
2016-11-15 - tmpfs files not found / systemd
While debugging a problem with EDNS records, I wanted to get some cache info from the PowerDNS pdns-recursor.
rec_control dump-cache should supply it, but I did not
see it.
# rec_control dump-cache out.txt
Error opening dump file for writing: Permission denied
Doh, it's running as the pdns user. Let's write in /tmp.
# rec_control dump-cache /tmp/out.txt
dumped 42053 records
# less /tmp/out.txt
/tmp/out.txt: No such file or directory
Wait what? No files?
Turns out Systemd has mapped /tmp
into a private location:
# ls -l /tmp/systemd-private-81..34-pdns-recursor.service-1..Q/tmp/out.txt
-rw-r----- 1 pdns pdns 2303585 nov 15 15:36 /tmp/systemd-private-81..34-pdns-recursor.service-1..Q/tmp/out.txt
Feature.
If you know it, it won't come as a surprise.
# grep -i tmp /lib/systemd/system/pdns-recursor.service
PrivateTmp=true
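If you do want such dumps to land in the real /tmp, a systemd drop-in can switch the feature off — a sketch, not necessarily a recommendation (the drop-in filename is arbitrary):

# mkdir -p /etc/systemd/system/pdns-recursor.service.d
# cat > /etc/systemd/system/pdns-recursor.service.d/tmp.conf << EOF
[Service]
PrivateTmp=false
EOF
# systemctl daemon-reload
# systemctl restart pdns-recursor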
2016-11-02 - setting up powerdns slave / untrusted host
When migrating our nameserver setup to start using DNSSEC, a second requirement was to offload a resolver to somewhere off-network.
You want your authoritative nameservers to be distributed across different geographical regions, networks and top level domains. That means, don't do this:
ns1.thedomain.com - datacenter X in Groningen
ns2.thedomain.com - datacenter X in Groningen
Do do this:
ns1.thedomain.com - datacenter X in Groningen
ns2.thedomain.org - datacenter Y in Amsterdam
In our case, we could use a third nameserver in a separate location: a virtual machine hosted by someone other than us. Let's call it ns3.thedomain.io.
Between our primary two nameservers we were happily using MySQL replication to share the updates. PowerDNS works perfectly using NATIVE mode and master-master or master-slave replication.
However, now that we've started using DNSSEC we must worry about our private keys not leaking. For starters we ensure that we use TLS for the MySQL replication between ns1 and ns2. And secondly: we cannot just trust any remote virtual machine vendor with our private DNS keys. That means no MySQL slave replication to the new ns3.thedomain.io.
Luckily, PowerDNS will also function in regular slave mode, and configuring it is not hard.
On ns1: /etc/powerdns/pdns.d/pdns.local.conf
# Make sure we send updates to slaves.
master=yes  # we're a master for ns3.thedomain.io

# Add the IP 55.55.55.55 of ns3.thedomain.io.
allow-axfr-ips=127.0.0.0/8,::1,55.55.55.55

# Don't notify all slaves found through DNS. If we do not set this,
# ns1 would start pushing updates to both ns2 and ns3. And ns2 does
# not want any: it operates in "native" master mode, taking the config
# out of the MySQL db directly.
only-notify=55.55.55.55  # the IP of ns3.thedomain.io
On the new ns3.thedomain.io slave, we configure MySQL support as usual, without any replication. Add this to the /etc/powerdns/pdns.d/pdns.local.conf:
# We slave off ns1.thedomain.com (or in fallback case from ns2.thedomain.org);
# see the supermasters table in mysql pdns schema.
slave = yes
Populate the supermasters
table on ns3:
mysql> select * from supermasters;
+-------------+-------------------+-----------+
| ip          | nameserver        | account   |
+-------------+-------------------+-----------+
| 44.44.44.44 | ns1.thedomain.com | TheDomain |
+-------------+-------------------+-----------+
Almost there.
At this point we need to do three things:
- Replace the NATIVE type with MASTER for the relevant domains:
UPDATE domains SET type = 'MASTER' WHERE type = 'NATIVE' AND name IN ('...');
Without this change, PowerDNS will not notify the slaves of any updates!
- Add an ns3.thedomain.io NS record to the relevant domains (using pdnsutil or SQL and a manually updated serial); see the sketch after this list.
- Add the third nameserver to the TLD — how you do this is specific to your registrar.
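For step 2, something along these lines should do (pdnsutil syntax as in PowerDNS 4.x; zone names are of course yours):

ns1# pdnsutil add-record somedomain.tld @ NS ns3.thedomain.io
ns1# pdnsutil increase-serial somedomain.tld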
At this point, your new PowerDNS on ns3.thedomain.io should start receiving records. You can check the database and/or syslog.
If some domains fail to appear — you did update the serial right? — you can force an update from ns1 using: pdns_control notify somedomain.tld
And, where on ns1 and ns2 you have the sensitive private key material and on-the-fly DNSSEC signing, on ns3 you get presigned (by ns1) data.
2016-10-28 - mysql sys schema / mysqldump failure
After upgrading the mysql-server to 5.7 and enabling GTIDs, the mysql-backup script started spewing errors.
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of
all transactions, even those that changed suppressed parts of the database. If you don't
want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass
--all-databases --triggers --routines --events.
(...repeated for every database schema...)
mysqldump: Couldn't execute 'SHOW FIELDS FROM `host_summary`': View 'sys.host_summary'
references invalid table(s) or column(s) or function(s) or definer/invoker of view lack
rights to use them (1356)
That's two errors for the price of one.
The first was easily fixed by doing as suggested:
- Add --set-gtid-purged=OFF to the per-schema mysqldump.
- Add a trailing dump for ALL; this is only viable if your databases are small or few. (If they are few, you could consider skipping the per-schema dump.)
mysqldump $mysqlargs --quick --all-databases --triggers --routines \
    --events >"$dstpath/ALL.$day.sql" &&
rm -f "$dstpath/ALL.$day.sql.bz2" &&
bzip2 "$dstpath/ALL.$day.sql"
The second error was a bit more strange. For some reason the mysql upgrade had created the tables, but not the triggers and the functions. Or they got lost during a dump restore. In any case, debugging went like this:
# /usr/local/bin/mysql-backup
mysqldump: Couldn't execute 'SHOW FIELDS FROM `host_summary`': View 'sys.host_summary'
references invalid table(s) or column(s) or function(s) or definer/invoker of view lack
rights to use them (1356)

# mysql --defaults-file=/etc/mysql/debian.cnf sys
...
mysql> show create view host_summary\G
...
*************************** 1. row ***************************
                View: host_summary
         Create View: CREATE ALGORITHM=TEMPTABLE DEFINER=`mysql.sys`@`localhost`
SQL SECURITY INVOKER VIEW `host_summary` AS select
if(isnull(`performance_schema`.`accounts`.`HOST`) ...

mysql> select * from host_summary;
ERROR 1356 (HY000): View 'sys.host_summary' references invalid table(s) or column(s) or
function(s) or definer/invoker of view lack rights to use them

mysql> select if(isnull(`performance_schema`.`accounts`.`HOST`),'background',...
...
ERROR 1305 (42000): FUNCTION sys.format_time does not exist
A-ha, a missing function.
# dpkg -S /usr/share/mysql/mysql_sys_schema.sql
mysql-server-5.7: /usr/share/mysql/mysql_sys_schema.sql
# mysql --defaults-file=/etc/mysql/debian.cnf < /usr/share/mysql/mysql_sys_schema.sql
ERROR 1064 (42000) at line 43: You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near '' at line 1
Invalid SQL? Nah, just inline semi-colons within statements. Since MySQL doesn't mind multiple statements at once, the fix was to surround the entire SQL with a new delimiter:
# ( cat /usr/share/mysql/mysql_sys_schema.sql; echo '####' ) | mysql --defaults-file=/etc/mysql/debian.cnf --delimiter='####' sys
Fixed! Now, what is that sys
database anyway?
It's a collection of views, functions and procedures to help MySQL administrators get insight into MySQL Database usage, according to the MySQL sys schema.
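For instance, both of these exercise the restored functions and views — a quick check:

mysql> SELECT sys.format_time(1234567890);
mysql> SELECT * FROM sys.host_summary\G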
That might come in handy later...
2016-10-27 - copy-pasting into java applications / x11
The other day I was rebooting our development server. It has full disk encryption, and the password for it has to be specified at boot time, long before it has network access.
Even though the machine is in the same building, walking over there is obviously not an option. The machine has IPMI, like all modern machines do, so we can connect a virtual console over the local network. For that, we use the SuperMicro ipmiview tool.
Unfortunately, it's a Java application that doesn't play that well with the rest of my X window system. In particular: it doesn't do pasting from the clipboard. The middle-mouse paste doesn't work, the CTRL-V paste doesn't, and the CTRL-SHIFT-V alternative doesn't either!
That is no fun if the encrypted disk password looks something like this:
vzyxLyi8hsdQIM1zWUlzZM14jZCk2iuOZ83pyzVH
Typing...
vzyxLyi8hsdQIM1zWUlz
.. did I type the lowercase Z? Or?
No key available with this passphrase
vzyxLyi8hsdQIM1zWUlzZM14jZCk2iuOZ*3
.. drat.. too slow with the shift-release.
You can see how that gets old real quick.
What if we could fake keypress events in the IPMI console window?
Turns out we can. Behold: xpaste.
Examining events in X
Events are constantly being passed around in the X window system. You can check which windows exist in your display, like this:
$ xwininfo -root -tree ... lots of windows
You can quickly find the window of your gnome-terminal with this trick:
$ cd `mktemp -d` /tmp/tmp.ej4mWwShfr$ xwininfo -root -tree | grep `pwd` -C4 0x1400264 (has no name): () 1855x1176+65+24 +65+24 1 child: 0x1400265 (has no name): () 1855x1176+0+0 +65+24 1 child: 0x2e0000a "walter@walter-desktop: /tmp/tmp.ej4mWwShfr": ("gnome-terminal-server" "Gnome-terminal") 1855x1176+0+0 +65+24 1 child: 0x2e0000b (has no name): () 1x1+-1+-1 +64+23 0x1403447 (has no name): () 1855x1176+65+24 +65+24 1 child:
Look, a window with my name on it ;-)
But that's not where the keyboard events go. If you run xev -id 0x2e0000a (or its child) there will be zero relevant events.
Fire up xev & sleep 10; kill $! and you can sniff
events in your terminal (or a test window, depending on which version
of xev(1)
you have) for 10 seconds. It'll show things like this:
KeyPress event, serial 37, synthetic NO, window 0x4400001, root 0x25d, subw 0x0, time 154332240, (58,131), root:(1658,183), state 0x0, keycode 38 (keysym 0x61, a), same_screen YES, XLookupString gives 1 bytes: (61) "a" XmbLookupString gives 1 bytes: (61) "a" XFilterEvent returns: False KeyRelease event, serial 37, synthetic NO, window 0x4400001, root 0x25d, subw 0x0, time 154332320, (58,131), root:(1658,183), state 0x0, keycode 38 (keysym 0x61, a), same_screen YES, XLookupString gives 1 bytes: (61) "a" XFilterEvent returns: False
Promising. That's the press and release of the 'a' key.
Rolled into xpaste
Xpaste works by emitting keypress events as if the user is typing. This had already been implemented by small utility apps like crikey. Crikey sends key events into the window that currently has focus.
For our purposes we needed to improve it to make it choose the right window too. And, as seen above, picking the right window to send the events to is not necessarily trivial.
This problem was tackled by having xpaste listen for some kind of key — in this case the [ENTER]. If the user would press enter in that window, that window is the right one to paste into. It grabs the root window (the parent of all windows) and asks to get events that relate to the [ENTER] key. When the currently focused window gets the keypress, xpaste has determined which window ID the user wants the paste in.
Combining the above, it looks like this — the translucent parts are after the enter keypress:
Yeah! It works! And it's implemented in pure python.
Fetch xpaste from github or through pip install xpaste.
2016-10-25 - packaging supermicro ipmiview / debian
Do you want to quickly deploy SuperMicro ipmiview
on your desktop?
IPMI is a specification for monitoring and management of computer hardware. Usually this is used for accessing servers in a data center when the regular remote login is not available. Think: hard rebooting a stuck machine, specifying the full disk encryption password at boot time, logging onto a machine where the remote login (ssh daemon) has disappeared.
The SuperMicro IPMI devices have an embedded webserver, but it requires Java to access
the console. And setting up a working Java accessible from your browser is a pain in the behind.
Luckily SuperMicro also offers a precompiled/packaged ipmiview
tool that works out
of the box. Unfortunately they don't provide a nice Debian/Ubuntu package for it.
But we can do that last bit ourselves, using the following recipe.
Step #1: Download IPMIView_2.12.0_build.160804_bundleJRE_Linux_x64.tar.gz. Save it as ipmiview_2.12.0+build160804.orig.tar.gz. Fetch it from the SuperMicro site: https://www.supermicro.com/products/nfo/SMS_IPMI.cfm
Step #2: Download
ipmiview_2.12.0+build160804-1~all.debian.tar.xz
Sha256: 3ee49e132 36706bec 4c504c94 a45a569 42e9f4a ef86b0d e8ae6f9 c272bc0 c889
Step #3: Untar, patch, build:
$ tar zxf ipmiview_2.12.0+build160804.orig.tar.gz
$ cd IPMIView_2.12.0_build.160804_bundleJRE_Linux_x64/
$ tar xf ../ipmiview_2.12.0+build160804-1~all.debian.tar.xz
$ dpkg-buildpackage -us -uc -sa
The debian.tar.xz contains a debian/
directory
with the appropriate rules to build the package. The call
to dpkg-buildpackage
does the building.
Note that the dpkg-deb
“building package 'ipmiview'”
step takes some time.
Afterwards, you'll have these files:
$ ls -l ../ipmiview_*
-rw-r--r-- 1 user user     1741 okt 25 14:41 ../ipmiview_2.12.0+build160804-1~all_amd64.changes
-rw-r--r-- 1 user user 72607368 okt 25 14:41 ../ipmiview_2.12.0+build160804-1~all_amd64.deb
-rw-r--r-- 1 user user     1536 okt 25 14:39 ../ipmiview_2.12.0+build160804-1~all.debian.tar.xz
-rw-r--r-- 1 user user      957 okt 25 14:39 ../ipmiview_2.12.0+build160804-1~all.dsc
-rw-r--r-- 1 user user 91078080 okt 25 12:19 ../ipmiview_2.12.0+build160804.orig.tar.gz
Step #4: You can now install the ipmiview_2.12.0+build160804-1~all_amd64.deb
using dpkg -i
or through your own APT repository.
Observe that the configuration files in /opt/ipmiview are world-writable(!) files:
$ ls -l /opt/ipmiview/*.properties
-rwxrwxrwx 1 root root 91 okt 25 14:40 /opt/ipmiview/account.properties
-rwxrwxrwx 1 root root  0 okt 25 14:40 /opt/ipmiview/email.properties
-rwxrwxrwx 1 root root 37 okt 25 14:40 /opt/ipmiview/IPMIView.properties
-rwxrwxrwx 1 root root  0 okt 25 14:40 /opt/ipmiview/timeout.properties
The config will be untouched when uninstalling ipmiview, so you won't lose your precious list of hosts.
2016-10-18 - golang / statically linked
So, Go binaries are supposed to be statically linked.
That's nice if you run inside cut-down environments where not even
libc
is available. But sometimes they use shared libraries
anyway?
TL;DR: Use CGO_ENABLED=0 or -tags netgo to create a static executable.
Take this example:
$ go version
go version go1.6.2 linux/amd64
$ go build gocollect.go
$ file gocollect
gocollect: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, \
    interpreter /lib64/ld-linux-x86-64.so.2, not stripped
$ ldd gocollect
        linux-vdso.so.1 =>  (0x00007ffe105d8000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f37e3e10000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f37e3a47000)
        /lib64/ld-linux-x86-64.so.2 (0x0000560cb6ecb000)
That's not static, is it?
But a minimalistic Go file is:
$ cat >>example.go <<EOF
package main

func main() {
    println("hi")
}
EOF
$ go build example.go
$ file example
example: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, \
    not stripped
Then, why is my gocollect binary not static?
Turns out this is caused by one of the imports. In this case "log/syslog"
but others have reported
the "net"
import to be the cause — which makes perfect sense if
the "log/syslog" package imports "net".
Apparently this was changed between Go 1.3 and 1.4: the "net" stuff brings in a dependency on "cgo" which in turn causes the dynamic linking.
However, that dependency is not strictly needed (*) and can be disabled with Go 1.6
using the CGO_ENABLED=0
environment variable.
(*) The "net" package can use its internal DNS resolver or it can use a cgo-based one
that calls C library routines. This can be useful if you use features of
nsswitch.conf(5)
, but often you don't and you just want /etc/hosts
lookups and then DNS queries to the resolvers found in /etc/resolv.conf
.
See the documentation at Name Resolution in golang
net.
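In practice, for gocollect either of these did the trick (matching the TL;DR above):

$ CGO_ENABLED=0 go build gocollect.go
$ go build -tags netgo gocollect.go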
For the purpose of getting familiar with statically versus dynamically linked binaries in Go, here's a testcase that lists a few options.
cgo.go
— must be linked dynamically
package main

// #include <stdlib.h>
// int fortytwo()
// {
//     return abs(-42);
// }
import "C"

import "fmt"

func main() {
    fmt.Printf("Hello %d!\n", C.fortytwo())
}
fmt.go
— defaults to static
package main import "fmt" func main() { fmt.Printf("Hello %d!\n", 42); }
log-syslog.go
— pulls in "net" and defaults to dynamic
package main import "log" import "log/syslog" func main() { _, _ = syslog.NewLogger(syslog.LOG_DAEMON | syslog.LOG_INFO, 0) log.Printf("Hello %d!\n", 42) }
Combining the above into a nice little Makefile. (It uses .RECIPEPREFIX, available in GNU make 3.82 and later only. If you don't have that, run s/^+ /\t/g.)
.RECIPEPREFIX = +

BINS = cgo fmt log-syslog

ALL_DEFAULT = $(addsuffix .auto,$(BINS)) cgo.auto
ALL_CGO0 = $(exclude cgo.cgo0,$(addsuffix .cgo0,$(BINS)))  # <-- no cgo.cgo0
ALL_CGO1 = $(addsuffix .cgo1,$(BINS)) cgo.cgo1
ALL_DYN = $(addsuffix .dyn,$(BINS)) cgo.dyn
ALL_NETGO = $(addsuffix .netgo,$(BINS)) cgo.netgo
ALL = $(ALL_DEFAULT) $(ALL_CGO0) $(ALL_CGO1) $(ALL_DYN) $(ALL_NETGO)

FMT = %-6s %-16s %s

.PHONY: all clean

all: $(ALL)
+ @echo
+ @printf '  $(FMT)\n' 'Type' 'Static' 'Dynamic'
+ @printf '  $(FMT)\n' '----' '------' '-------'
+ @for x in auto cgo0 cgo1 dyn netgo; do \
+   dyn=`for y in *.$$x; do ldd $$y | grep -q '=>' && \
+     echo $${y%.*}; done`; \
+   sta=`for y in *.$$x; do ldd $$y | grep -q '=>' || \
+     echo $${y%.*}; done`; \
+   printf '  $(FMT)\n' $$x "`echo $${sta:--}`" \
+     "`echo $${dyn:--}`"; \
+ done
+ @echo

clean:
+ $(RM) $(ALL)

%.auto: %.go
+ go build $< && x=$< && mv $${x%%.go} $@

%.cgo0: %.go
+ CGO_ENABLED=0 go build $< && x=$< && mv $${x%%.go} $@

%.cgo1: %.go
+ CGO_ENABLED=1 go build $< && x=$< && mv $${x%%.go} $@

%.dyn: %.go
+ go build -ldflags -linkmode=external $< && x=$< && mv $${x%%.go} $@

%.netgo: %.go
+ go build -tags netgo $< && x=$< && mv $${x%%.go} $@
You'll notice how I removed cgo.cgo0
from ALL_CGO0
because it will refuse to build.
Running make
builds the files with a couple of different options and
reports their type:
$ make
...
  Type   Static           Dynamic
  ----   ------           -------
  auto   fmt              cgo log-syslog
  cgo0   fmt log-syslog   -
  cgo1   fmt              cgo log-syslog
  dyn    -                cgo fmt log-syslog
  netgo  fmt log-syslog   cgo
This table clarifies a couple of things:
- you get static executables unless there is a need to make them dynamic;
- you force dynamic executables with -ldflags -linkmode=external;
- CGO_ENABLED=0 will disable cgo-support, making a static binary more likely;
- -tags netgo will disable netcgo-support, making a static binary more likely.
I didn't find any Go 1.6 toggle to force the creation of static binaries, but using one of the two options above is good enough for me.
2016-10-05 - sipp / travis / osx build / openssl
A couple of days ago our SIPp Travis CI builds started failing due to missing OpenSSL include files.
Between SIPp build 196 and SIPp build 215 the OSX builds on Travis started failing with the following configure error:
checking openssl/bio.h usability... no
checking openssl/bio.h presence... no
checking for openssl/bio.h... no
configure: error: <openssl/bio.h> header missing
It turns out that something had changed in the build environment and OpenSSL headers and libraries were no longer reachable.
After scouring the internet for clues, it was brew link openssl --force that came with the resolution:
$ brew link openssl --force
Warning: Refusing to link: openssl
Linking keg-only openssl means you may end up linking against the insecure,
deprecated system OpenSSL while using the headers from Homebrew's openssl.
Instead, pass the full include/library paths to your compiler e.g.:
  -I/usr/local/opt/openssl/include -L/usr/local/opt/openssl/lib
The fix was to add appropriate CPPFLAGS and LDFLAGS:
--- a/.travis.yml
+++ b/.travis.yml
@@ -59,7 +59,12 @@ before_script:
   - autoreconf -vifs
   - if [ "$TRAVIS_OS_NAME" = osx ]; then brew update; fi
   - if [ "$TRAVIS_OS_NAME" = osx ]; then brew install gsl; fi
-  - ./configure $CONFOPTS
+  # 2016-10: Apple doesn't include openssl any more because of security
+  # problems openssl had. Manually specify path to includes/libs.
+  - if [ "$TRAVIS_OS_NAME" = osx ]; then brew install openssl; fi
+  - if [ "$TRAVIS_OS_NAME" = osx ]; then CPPFLAGS="-I/usr/local/opt/openssl/include"; fi
+  - if [ "$TRAVIS_OS_NAME" = osx ]; then LDFLAGS="-L/usr/local/opt/openssl/lib"; fi
+  - ./configure CPPFLAGS=$CPPFLAGS LDFLAGS=$LDFLAGS $CONFOPTS
 script:
   - make -j2
All's well that ends well.
2016-09-14 - lxc / create image / debian squeeze
I'm quite happy with our LXC environment on which I've got various
Debian and Ubuntu build VMs so I can package backports and
other fixes into nice .deb
packages.
Today I needed an old Debian/Squeeze machine to build backports on.
Step one: check the lists.
$ lxc remote list +-----------------+------------------------------------------+---------------+--------+--------+ | NAME | URL | PROTOCOL | PUBLIC | STATIC | +-----------------+------------------------------------------+---------------+--------+--------+ | images | https://images.linuxcontainers.org | simplestreams | YES | NO | +-----------------+------------------------------------------+---------------+--------+--------+ | local (default) | unix:// | lxd | NO | YES | +-----------------+------------------------------------------+---------------+--------+--------+ | ubuntu | https://cloud-images.ubuntu.com/releases | simplestreams | YES | YES | +-----------------+------------------------------------------+---------------+--------+--------+ | ubuntu-daily | https://cloud-images.ubuntu.com/daily | simplestreams | YES | YES | +-----------------+------------------------------------------+---------------+--------+--------+ $ lxc image list images: +---------------------------------+--------------+--------+-----------------------------------------+---------+----------+-------------------------------+ | ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCH | SIZE | UPLOAD DATE | +---------------------------------+--------------+--------+-----------------------------------------+---------+----------+-------------------------------+ | alpine/3.1 (3 more) | e63bc8abc9cf | yes | Alpine 3.1 amd64 (20160914_17:50) | x86_64 | 2.32MB | Sep 14, 2016 at 12:00am (UTC) | +---------------------------------+--------------+--------+-----------------------------------------+---------+----------+-------------------------------+ ...
Lots of images. Any Squeeze in there?
$ lxc image list images: | grep -i squeeze
$ lxc image list local: | grep -i squeeze
$ lxc image list ubuntu: | grep -i squeeze
$ lxc image list ubuntu-daily: | grep -i squeeze
Nope. And the internet wasn't too helpful either.
So, on to build a custom LXC image.
The Ubuntu Insight website was pretty helpful with the Manually building an image topic:
- Generate a container filesystem. This entirely depends on the distribution you’re using. For Ubuntu and Debian, it would be by using debootstrap.
- Configure anything that’s needed for the distribution to work properly in a container (if anything is needed).
- Make a tarball of that container filesystem, optionally compress it.
- Write a new metadata.yaml file based on the one described above.
- Create another tarball containing that metadata.yaml file.
- Import those two tarballs as a LXD image with:
lxc image import <metadata tarball> <rootfs tarball> --alias some-name
Apparently that first step is automated by the scripts in the LXC repository:
$ ls lxc/templates
lxc/templates$ ls *.in
lxc-alpine.in     lxc-cirros.in    lxc-openmandriva.in  lxc-slackware.in     Makefile.in
lxc-altlinux.in   lxc-debian.in    lxc-opensuse.in      lxc-sparclinux.in
lxc-archlinux.in  lxc-download.in  lxc-oracle.in        lxc-sshd.in
lxc-busybox.in    lxc-fedora.in    lxc-plamo.in         lxc-ubuntu-cloud.in
lxc-centos.in     lxc-gentoo.in    lxc-pld.in           lxc-ubuntu.in
Try it:
$ sudo bash lxc-debian.in --path=/home/walter/debian-squeeze \
    --arch=amd64 --release=squeeze \
    --mirror=http://archive.debian.org/debian \
    --security-mirror=http://archive.debian.org/debian
debootstrap is /usr/sbin/debootstrap
Invalid release squeeze, valid ones are: wheezy jessie stretch sid
I'm sure we can hack that a bit (I used git checkout 13dbc780), by:
- fixing the paths (normally done by automake),
- adding 'squeeze' in the appropriate places,
- work around an issue where the nics count is not found.
--- lxc-debian.in       2016-09-14 13:43:07.739541126 +0200
+++ lxc-debian  2016-09-15 11:42:22.645638673 +0200
@@ -36,8 +36,8 @@
 export GREP_OPTIONS=""
 MIRROR=${MIRROR:-http://httpredir.debian.org/debian}
 SECURITY_MIRROR=${SECURITY_MIRROR:-http://security.debian.org/}
-LOCALSTATEDIR="@LOCALSTATEDIR@"
-LXC_TEMPLATE_CONFIG="@LXCTEMPLATECONFIG@"
+LOCALSTATEDIR="/tmp/lxc-create-var"
+LXC_TEMPLATE_CONFIG="/tmp/lxc-create-config"
 
 # Allows the lxc-cache directory to be set by environment variable
 LXC_CACHE_PATH=${LXC_CACHE_PATH:-"$LOCALSTATEDIR/cache/lxc"}
@@ -314,7 +314,7 @@ cleanup()
 download_debian()
 {
     case "$release" in
-        wheezy)
+        squeeze|wheezy)
             init=sysvinit
             ;;
         *)
@@ -490,6 +490,7 @@ copy_configuration()
     # if there is exactly one veth network entry, make sure it has an
     # associated hwaddr.
     nics=$(grep -ce '^lxc\.network\.type[ \t]*=[ \t]*veth' "$path/config")
+    nics=0
     if [ "$nics" -eq 1 ]; then
         grep -q "^lxc.network.hwaddr" "$path/config" || sed -i -e "/^lxc\.network\.type[ \t]*=[ \t]*veth/a lxc.network.hwaddr = 00:16:3e:$(openssl rand -hex 3| sed 's/\(..\)/\1:/g; s/.$//')" "$path/config"
     fi
@@ -753,7 +754,7 @@ fi
 current_release=$(wget "${MIRROR}/dists/stable/Release" -O - 2> /dev/null | head |awk '/^Codename: (.*)$/ { print $2; }')
 release=${release:-${current_release}}
-valid_releases=('wheezy' 'jessie' 'stretch' 'sid')
+valid_releases=('squeeze' 'wheezy' 'jessie' 'stretch' 'sid')
 if [[ ! "${valid_releases[*]}" =~ (^|[^[:alpha:]])$release([^[:alpha:]]|$) ]]; then
     echo "Invalid release ${release}, valid ones are: ${valid_releases[*]}"
     exit 1
Run it:
$ sudo bash lxc-debian --path=/home/walter/debian-squeeze \
    --arch=amd64 --release=squeeze \
    --mirror=http://archive.debian.org/debian \
    --security-mirror=http://archive.debian.org/debian
...
$ cd /home/walter/debian-squeeze
$ ls -l
total 13
-rw-r--r--  1 root root 165 sep 14 14:04 config
drwxr-xr-x 20 root root  21 sep 14 14:09 rootfs
This needs a metadata.yaml as described (on the same page mentioned earlier) under Image metadata. The following is enough:
architecture: "amd64" creation_date: 1473854884 properties: architecture: "amd64" description: "Debian GNU/Linux 6.0 (squeeze)" os: "debian" release: "squeeze"
Tar them together into a package and import it:
$ sudo tar zcf ../debian-squeeze.tar.gz metadata.yaml rootfs
$ lxc image import ../debian-squeeze.tar.gz
Transferring image: 100%
Image imported with fingerprint: 9bcaba951c4efabbd5ee32f74f01df567d7cb1c725cc67512283b0bb88ea7a91
$ lxc image alias create debian/squeeze 9bcaba951c4efabbd5ee32f74f01df567d7cb1c725cc67512283b0bb88ea7a91
Nice, it works. Do we have an image? Yes we do.
$ lxc image list local: | grep -i squeeze
| debian/squeeze | 9bcaba951c4e | no | Debian GNU/Linux 6.0 (squeeze) | x86_64 | 112.52MB | Sep 14, 2016 at 12:10pm (UTC) |
$ lxc launch debian/squeeze squeeze-builder
$ lxc exec squeeze-builder /bin/bash
root@squeeze-builder:~# cat /etc/issue.net
Debian GNU/Linux 6.0
It's a bit bare, but it works. Some tweaks to
/etc/hosts
and /etc/hostname
seemed enough. Time to get building!
2016-07-09 - letsencrypt / license update / show differences
This morning, Let's Encrypt e-mailed me that the Subscriber Agreement was updated; but it had no diff.
Let's Encrypt Subscriber,
We're writing to let you know that we are updating the Let's Encrypt Subscriber Agreement, effective August 1, 2016. You can find the updated agreement (v1.1.1) as well as the current agreement (v1.0.1) in the "Let's Encrypt Subscriber Agreement" section of the following page:
https://letsencrypt.org/repository/
Thank you for helping to secure the Web by using Let's Encrypt.
- The Let's Encrypt Team
Let's get one thing clear: I love Let's Encrypt!
Before Let's Encrypt, there was a huge penalty to making your sites safer. If you didn't have any money to spend, you could make your site safer than having no certificate, by using a self-signed certificate, but you'd be scaring off your visitors because the browsers complain loudly; a disturbing paradox.
(I'm ignoring the few little known certificate brokers that handed out a limited set of free certificates here, because they, well... did I mention the word limited?)
Dear Let's Encrypt,
Thank you for your service. It is 2016+ worthy, even more so since you've decoupled the service from the vanilla letsencrypt application.
What is not 2016+ worthy, is your license update.
A license update should come with a diff, so we can see what has changed and what has not.
So, for your enjoyment and mine, I took the texts from the before and after PDF files and reindented them so they can be properly compared.
Fetch them here:
Let's Encrypt Subscriber Agreement - July 2015
(view)
Let's Encrypt Subscriber Agreement - August 2016
(view)
Compare the two using a tool like vimdiff. Or — for less readability — check out the unified diff (view).
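For the curious, producing such a diff boils down to something like this (filenames hypothetical; the reindenting/reflowing was done by hand):

$ pdftotext -layout old-agreement.pdf old.txt
$ pdftotext -layout new-agreement.pdf new.txt
$ # ... manually reindent/reflow both text files ...
$ diff -u old.txt new.txt > subscriber-agreement.diff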
2016-03-24 - apt / insufficiently signed / weak digest
When adding our own apt repository to a new Ubuntu/Xenial machine, I got an "insufficiently signed (weak digest)" error.
# apt-get update
...
W: gpgv:/var/lib/apt/lists/partial/ppa.osso.nl_ubuntu_dists_xenial_InRelease: The repository is insufficiently signed by key 4D1...0F5 (weak digest)
Confirmed it with gpgv.
# gpgv --keyring /etc/apt/trusted.gpg \
    /var/lib/apt/lists/ppa.osso.nl_ubuntu_dists_xenial_InRelease
gpgv: Signature made Wed 23 Mar 2016 10:14:48 AM UTC using RSA key ID B36530F5
gpgv: Good signature from "PPA-OSSO-NL <support+ppa@osso.nl>"

# gpgv --weak-digest sha1 --verbose --keyring /etc/apt/trusted.gpg \
    /var/lib/apt/lists/ppa.osso.nl_ubuntu_dists_xenial_InRelease
gpgv: armor header: Hash: SHA1
gpgv: armor header: Version: GnuPG v1.4.11 (GNU/Linux)
gpgv: original file name=''
gpgv: Signature made Wed 23 Mar 2016 10:14:48 AM UTC using RSA key ID B36530F5
gpgv: Note: signatures using the SHA1 algorithm are rejected
gpgv: Can't check signature: unknown digest algorithm
Indeed, SHA1.
We'll need to enforce a newer digest on the
reprepro
repository server:
reprepro# cat >> ~/.gnupg/gpg.conf << EOF
# Prefer better digests for signing.
personal-digest-preferences SHA512 SHA384 SHA256 SHA224
EOF
Regenerate the release files with updated signatures:
reprepro# reprepro export ...
Go back to the user host, and check for success:
# apt-get update
... (no errors)
# gpgv --verbose --keyring /etc/apt/trusted.gpg \
    /var/lib/apt/lists/ppa.osso.nl_ubuntu_dists_xenial_InRelease
gpgv: armor header: Hash: SHA512
gpgv: armor header: Version: GnuPG v1.4.11 (GNU/Linux)
gpgv: original file name=''
gpgv: Signature made Wed 23 Mar 2016 10:30:04 AM UTC using RSA key ID B36530F5
gpgv: Good signature from "PPA-OSSO-NL <support+ppa@osso.nl>"
gpgv: textmode signature, digest algorithm SHA512
Excellent. SHA512 this time, and no complaints from apt.
Update 2016-06-12
Fixed typo after feedback from Simon Leinen at SWITCH. He also remarked the following:
Some people have a gnupghome +b/gpg
(basedir + "/gpg") setting in their
~reprepro/conf/options
. If that's the case, the
personal-digest-preferences
line should go into
~reprepro/conf/gpg.conf
instead of ~/.gnupg/gpg.conf
.
2016-03-23 - lxcfs - proc uptime
When removing the excess LXC and LXD package from the LXC guest and
working around Ubuntu/Xenial
reboot issues I noticed the lxcfs
mounts on my LXC
guest.
(No, you don't need the lxcfs
package on the guest.)
guest# mount | grep lxc
lxcfs on /proc/cpuinfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/diskstats type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/meminfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/stat type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/swaps type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
Apparently we're really looking at a subset of proc from inside the LXC guest, but then some files have been patched with bind mounts.
For instance, the uptime file:
guest# cat /proc/uptime
78.0 75.0
guest# uptime
 07:57:45 up 1 min,  0 users,  load average: 0.22, 0.25, 0.14
guest# mount | grep uptime
lxcfs on /proc/uptime type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
guest# umount /proc/uptime
At this point, the host OS uptime is unshadowed again.
guest# cat /proc/uptime
571190.55 8953464.60
guest# uptime
 07:57:57 up 6 days, 14:39,  0 users,  load average: 0.17, 0.24, 0.13
guest# cat /var/lib/lxcfs/proc/uptime
121.0 118.0
Note that I don't know how to mount it again though:
guest# mount -n --bind /var/lib/lxcfs/proc/uptime /proc/uptime
mount: mount /var/lib/lxcfs/proc/uptime on /proc/uptime failed: Permission denied

lxd# tail -n1 /var/log/syslog
Mar 23 09:49:24 dev kernel: [574233.745082] audit: type=1400 audit(1458722964.456:246): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxd-guest_</var/lib/lxd>" name="/proc/uptime" pid=10938 comm="mount" srcname="/var/lib/lxcfs/proc/uptime" flags="rw, bind"
But it's back after a reboot; good enough for me, for now.
2016-03-22 - lxc - ubuntu xenial - reboot
The current Ubuntu/Xenial guest image on our new LXD container host contained too many packages.
It held the lxd
package and a bunch of lxc
packages. They are not needed on the container guest.
At some point before or after removing them,
for some reason the ZFS container got unmounted.
This went unnoticed until I tried a reboot:
guest# reboot

lxd# lxc exec guest /bin/bash
error: Container is not running.
lxd# lxc start guest
error: Error calling 'lxd forkstart guest /var/lib/lxd/containers /var/log/lxd/guest/lxc.conf': err='exit status 1'
Try `lxc info --show-log guest` for more info

lxd# lxc info --show-log guest
...
lxc 20160323093357.449 ERROR lxc_conf - conf.c:mount_rootfs:806 - No such file or directory - failed to get real path for '/var/lib/lxd/containers/guest/rootfs'

lxd# zfs list -o name,mounted | grep guest
data/containers/guest  no
lxd# zfs mount data/containers/guest
Now the guest was able to start again.
Next up, the Ubuntu/Xenial refused to reboot because the dhclient would hang during shutdown. A quick workaround was to append this last line to /etc/network/interfaces.d/eth0.cfg:
# The primary network interface
auto eth0
iface eth0 inet dhcp
    pre-down /usr/bin/pkill -9 dhclient
Now the reboot worked as intended.
Interestingly, when trying to reproduce both issues on the 23rd of March, I couldn't with the latest Ubuntu/Xenial image. It looks like at least one of the issues got fixed. Good job, LXC people!
2016-03-21 - renaming / lxd managed lxc container
Renaming an LXD managed LXC container is not straight forward. But if you want to rename the host from inside the container, you should do so on the outside as well. If you don't, you may notice that for instance the DHCP manual IP address assignment doesn't work as expected.
Creating a new LXC container
For example, we'll create a new container called walter-old with a fresh Debian/Jessie on it.
lxd# lxc image list images: | grep debian/.*amd64 | debian/jessie/amd64 (1 more) | 44d03be949e5 | yes | Debian jessie (amd64) (20160320_22:42) | x86_64 | 104.27MB | Mar 20, 2016 at 11:23pm (UTC) | | debian/sid/amd64 (1 more) | 83eaa940759c | yes | Debian sid (amd64) (20160320_22:42) | x86_64 | 109.47MB | Mar 20, 2016 at 11:43pm (UTC) | | debian/stretch/amd64 (1 more) | 692d8e094ec1 | yes | Debian stretch (amd64) (20160320_22:42) | x86_64 | 109.08MB | Mar 20, 2016 at 11:33pm (UTC) | | debian/wheezy/amd64 (1 more) | 427b19b85622 | yes | Debian wheezy (amd64) (20160320_22:42) | x86_64 | 98.47MB | Mar 20, 2016 at 11:17pm (UTC) |
lxd# lxc launch images:debian/jessie/amd64 walter-old
Creating walter-old
Retrieving image: 100%
Starting walter-old
lxd# lxc exec walter-old /bin/bash
walter-old# hostname
walter-old
Very well. Let's assign a fixed IP address to it.
lxd# echo 'dhcp-host=walter-old,10.11.12.13' >> /etc/lxc/dnsmasq.conf
lxd# systemctl restart lxc-net
lxd# lxc stop walter-old
lxd# lxc start walter-old
lxd# lxc list | grep walter-old
| walter-old | RUNNING | 10.11.12.13 (eth0) | PERSISTENT |
Awesome. But now, rename the container and watch it get the wrong IP.
Rename the container from inside
lxd# lxc exec walter-old /bin/bash
walter-old# hostname walter-new
walter-old# echo walter-new > /etc/hostname
walter-old# bash
walter-new# reboot
lxd# lxc list | grep walter-old
| walter-old | RUNNING | 10.11.12.99 (eth0) | PERSISTENT |
Drat! The wrong IP. And we don't want to put walter-new in /etc/lxc/dnsmasq.conf without renaming the container.
Renaming the container from outside
Stop the container and rename it by moving it on localhost:
lxd# lxc stop walter-old
lxd# lxc list | grep walter-old
| walter-old | STOPPED | | PERSISTENT |
lxd# lxc move walter-old walter-new
Rename the host in the dnsmasq.conf:
lxd# sed -i -e 's/walter-old/walter-new/' /etc/lxc/dnsmasq.conf
lxd# systemctl restart lxc-net
That should do it. Start it.
lxd# lxc start walter-new
lxd# lxc list | grep walter-new
| walter-new | RUNNING | 10.11.12.13 (eth0) | PERSISTENT |
Behind the scenes
Behind the scenes, lxc move
takes care of the details.
Moving things on localhost could be done manually as well.
It would look somewhat like this:
lxd# find /var/lib/lxd/ -name 'walter-old*'
/var/lib/lxd/containers/walter-old
/var/lib/lxd/containers/walter-old.zfs
/var/lib/lxd/shmounts/walter-old
/var/lib/lxd/devices/walter-old
/var/lib/lxd/security/seccomp/walter-old
lxd# ls -l /var/lib/lxd/containers | grep walter-old
lrwxrwxrwx 1 root   root   38 Mar 21 13:29 walter-old -> /var/lib/lxd/containers/walter-old.zfs
drwxr-xr-x 4 100000 100000  5 Mar 21 13:29 walter-old.zfs
Start by renaming and moving the ZFS filesystem.
lxd# zfs get mountpoint data/containers/walter-old
NAME                        PROPERTY    VALUE                                   SOURCE
data/containers/walter-old  mountpoint  /var/lib/lxd/containers/walter-old.zfs  local
lxd# umount /var/lib/lxd/containers/walter-old.zfs
lxd# mv /var/lib/lxd/containers/walter-{old,new}.zfs
lxd# zfs rename data/containers/walter-{old,new}
lxd# zfs set mountpoint=/var/lib/lxd/containers/walter-new.zfs data/containers/walter-new
lxd# zfs mount data/containers/walter-new
lxd# mount | grep walter-new
data/containers/walter-new on /var/lib/lxd/containers/walter-new.zfs type zfs (rw,relatime,xattr,noacl)
lxd# rm /var/lib/lxd/containers/walter-old
lxd# ln -s /var/lib/lxd/containers/walter-new{.zfs,}
lxd# ls -lda /var/lib/lxd/containers/walter-new{.zfs,}
lrwxrwxrwx 1 root   root   38 Mar 21 13:40 /var/lib/lxd/containers/walter-new -> /var/lib/lxd/containers/walter-new.zfs
drwxr-xr-x 4 100000 100000  5 Mar 21 13:40 /var/lib/lxd/containers/walter-new.zfs
Next, a few more references:
lxd# find /var/lib/lxd/ -name 'walter-old*'
/var/lib/lxd/shmounts/walter-old
/var/lib/lxd/devices/walter-old
/var/lib/lxd/security/seccomp/walter-old
lxd# mv /var/lib/lxd/shmounts/walter-{old,new}
lxd# mv /var/lib/lxd/devices/walter-{old,new}
lxd# mv /var/lib/lxd/security/seccomp/walter-{old,new}
Lastly, the LXD database:
lxd# sqlite3 /var/lib/lxd/lxd.db
sqlite> select * from containers where name = 'walter-old';
14|walter-old|2|0|0|0|1458563364
sqlite> update containers set name = 'walter-new' where name = 'walter-old';
sqlite> ^D
2016-02-21 - missing sofiles / linker / asterisk / pjsip
When compiling Asterisk against a PJProject that had been debianized (using the debian/ directory) for Ubuntu/Trusty, I got the following build error:
$ gcc -o chan_pjsip.so -pthread -shared -Wl,--version-script,chan_pjsip.exports,--warn-common \
    chan_pjsip.o pjsip/dialplan_functions.o -lpjsua2 -lstdc++ -lpjsua -lpjsip-ua \
    -lpjsip-simple -lpjsip -lpjmedia-codec -lpjmedia-videodev -lpjmedia-audiodev \
    -lpjmedia -lpjnath -lpjlib-util -lsrtp -lpj -lm -lrt -lpthread \
    -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lv4l2 -lopencore-amrnb \
    -lopencore-amrwb
/usr/bin/ld: cannot find -lSDL2
/usr/bin/ld: cannot find -lavformat
/usr/bin/ld: cannot find -lavcodec
/usr/bin/ld: cannot find -lswscale
/usr/bin/ld: cannot find -lavutil
/usr/bin/ld: cannot find -lv4l2
/usr/bin/ld: cannot find -lopencore-amrnb
/usr/bin/ld: cannot find -lopencore-amrwb
collect2: error: ld returned 1 exit status
That's odd. I have those libs installed. Why is there no versionless .so variant?
$ dpkg -L libsdl2-2.0-0 | grep /lib/
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0.2.0
/usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0
No .so symlink in that package?
We can make the build succeed by doing the following:
$ cd /usr/lib/x86_64-linux-gnu
$ sudo ln -s libSDL2-2.0.so.0.2.0 libSDL2.so
That fixed the first error, and now the other ones:
$ for x in avformat avcodec swscale v4l2 avutil opencore-amrnb opencore-amrwb
  do test -f lib$x.so || sudo ln -s `readlink lib$x.so.* | sort | tail -n1` lib$x.so
  done
At this point, you must re-run configure on Asterisk, because some configure autodetection tests may have failed on the same issue.
$ ./configure --enable-dev-mode
$ make
... finished successfully ...
A bug for this issue was filed here: debian bug #804460
But... that is only half of the story.
There shouldn't be any .so files in the regular library package. They are in the -dev package:
$ dpkg -L libavutil52 | grep /lib/.*so
/usr/lib/x86_64-linux-gnu/libavutil.so.52.3.0
/usr/lib/x86_64-linux-gnu/libavutil.so.52
$ dpkg -L libavutil-dev | grep /lib/.*so
/usr/lib/x86_64-linux-gnu/libavutil.so
Why is that?
The Debian packaging manual: 8.4 Development files has this to say:
If there are development files associated with a shared library, the source package needs to generate a binary development package named [...] libraryname-dev. [...]
The development package should contain a symlink for the associated shared library without a version number. For example, the libgdbm-dev package should include a symlink from /usr/lib/libgdbm.so to libgdbm.so.3.0.0. This symlink is needed by the linker (ld) when compiling packages, as it will only look for libgdbm.so when compiling dynamically.
As you know, a binary X is generally linked against a shared library libY.so.MAJOR. Like this:
$ ldd /bin/ls | grep libacl
    libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x00007f23466cd000)
But, I don't have version 1, I have version 1.1.0.
$ readlink /lib/x86_64-linux-gnu/libacl.so.1
libacl.so.1.1.0
And when ls was linked, it wasn't against libacl.so.1, but against libacl.so, without version:
$ gcc ... -lacl ...
This works, not because ld knows anything about taking only the MAJOR version, but because that version was built into the .so file:
$ objdump -p /lib/x86_64-linux-gnu/libacl.so.1.1.0 | grep SONAME
  SONAME               libacl.so.1
That explains the versions, and why we do need the libacl.so.1 symlink in the non-dev package, but we don't need libacl.so there. We only need that when developing, and then we get it from the -dev package.
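If you want to see where that SONAME comes from, you can build a toy shared library yourself; the name is embedded at link time through -Wl,-soname (the libtoy names below are made up for this example):

$ echo 'int answer(void) { return 42; }' > toy.c
$ gcc -shared -fPIC -Wl,-soname,libtoy.so.1 -o libtoy.so.1.0.0 toy.c
$ objdump -p libtoy.so.1.0.0 | grep SONAME
  SONAME               libtoy.so.1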
As for the bug at hand: I believe this is a misconfiguration in Asterisk or PJProject and should be fixed there. The pjproject libs may be linked to other libraries, but that doesn't mean we (Asterisk) need to link against those too.
In fact, the problem lies with Asterisk calling pkg-config --libs pjproject and pjproject listing the libraries it uses itself in the public Libs section (instead of the private one).
$ grep ^Libs /usr/lib/x86_64-linux-gnu/pkgconfig/libpjproject.pc
Libs: -L${libdir} -lpjsua2 -lstdc++ -lpjsua -lpjsip-ua -lpjsip-simple -lpjsip -lpjmedia-codec -lpjmedia -lpjmedia-videodev -lpjmedia-audiodev -lpjmedia -lpjnath -lpjlib-util -lsrtp -lpj -lm -lrt -lpthread -L/usr/lib/x86_64-linux-gnu -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lv4l2 -lopencore-amrnb -lopencore-amrwb
Unless I'm mistaken, only the -lpj* libraries should be in there. The others shouldn't be, or should be in the Libs.private section instead, as this libtiff example shows:
$ grep ^Libs /usr/lib/x86_64-linux-gnu/pkgconfig/libtiff-4.pc
Libs: -L${libdir} -ltiff
Libs.private: -llzma -ljbig -ljpeg -lz
Or Asterisk shouldn't use the pkg-config values at all, since they haul in lots of unnecessary dependencies. Or both ;-)
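For completeness: Libs.private is only emitted when you explicitly ask pkg-config for static linking flags, so a normal dynamic build never sees those extra -l options. With the libtiff example above, the output should look roughly like this:

$ pkg-config --libs libtiff-4
-ltiff
$ pkg-config --static --libs libtiff-4
-ltiff -llzma -ljbig -ljpeg -lz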
Update 2016-08-16
Harm Geerts pointed me to this clear explanation about "Overlinking" when I ran into this issue again.
This time the issue reared its head during Asterisk configure time, when it decided that the compilation of the HAVE_PJ_TRANSACTION_GRP_LOCK configure test failed, not because pjsip_tsx_create_uac2 did not exist, but because the "overlinked" dependencies (like -lSDL2) did not exist on my build system.
When the test incorrectly assumed that HAVE_PJ_TRANSACTION_GRP_LOCK was false, the compilation failed in res/res_pjsip/pjsip_distributor.c, where the old-style pj_mutex_unlock(tsx->mutex) was called instead of the newer pj_grp_lock_release(tsx->grp_lock).
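A minimal way to reproduce what such a configure test runs into, assuming the overlinked .pc file from above, is linking an empty program against the pkg-config output:

$ cat > conftest.c << EOF
int main(void) { return 0; }
EOF
$ gcc conftest.c $(pkg-config --cflags --libs libpjproject)
/usr/bin/ld: cannot find -lSDL2
...
collect2: error: ld returned 1 exit status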
Update 2016-08-31
An updated pjproject.pc could look like this:
# Package Information for pkg-config

prefix=/usr
exec_prefix=${prefix}
libdir=/usr/lib/x86_64-linux-gnu
includedir=/usr/include

Name: libpjproject
Description: Multimedia communication library
URL: http://www.pjsip.org
Version: 2.4.5
Libs: -L${libdir} -L/usr/lib/x86_64-linux-gnu -lpjsua2 -lpjsua -lpjsip-ua -lpjsip-simple -lpjsip -lpjmedia-codec -lpjmedia -lpjmedia-videodev -lpjmedia-audiodev -lpjmedia -lpjnath -lpjlib-util -lpj
Libs.private: -lstdc++ -lsrtp -lm -lrt -lpthread -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lv4l2 -lopencore-amrnb -lopencore-amrwb
Cflags: -I${includedir} -I/usr/include -DPJ_AUTOCONF=1 -DPJ_IS_BIG_ENDIAN=0 -DPJ_IS_LITTLE_ENDIAN=1
2016-02-12 - python / xinetd / virtualenv
So, while developing a server application for a client, my colleague Harm decided it would be a waste of our programming time to add TCP server code.
Inetd and friends can do that really well. The number of new connections to the server would be minimal, so the overhead of spawning a new Python process for every connection was negligible.
Using xinetd as an inetd server wrapper is simple. The config would look basically like this:
service my_server
{
    ...
    port        = 20001
    server      = /path/to/virtualenv/bin/python
    server_args = /path/to/app/server.py
    ...
Yes! That's right. We can call the python executable from the virtualenv directory and get the right environment without having to call the 'activate' wrapper.
We can? Yes, we can. Check this out:
$ cd /path/to/very-minimal-virtualenv
$ ls -l `find . -type f -o -type l`
-rwxr-xr-x 1 walter walter 3773512 feb 11 17:08 ./bin/python
lrwxrwxrwx 1 walter walter      24 feb 12 08:58 ./lib/python2.7/os.py -> /usr/lib/python2.7/os.py
-rw-rw-r-- 1 walter walter       0 feb 12 08:57 ./lib/python2.7/site.py
-rw-rw-r-- 1 walter walter     126 feb 12 09:00 ./lib/python2.7/site.pyc
$ ./bin/python -c 'import sys; print sys.prefix'
/path/to/very-minimal-virtualenv
Awesome, then we won't need a wrapper that sources ./bin/activate.
Except... it didn't work when called from xinetd!
The python executable called from xinetd stubbornly decided that sys.prefix points to /usr: the wrong Python environment would be loaded.
If we added any random wrapper application in between, things would work:
server      = /usr/bin/env
server_args = /path/to/virtualenv/bin/python /path/to/app/server.py
And this worked:
server      = /usr/bin/time
server_args = /path/to/virtualenv/bin/python /path/to/app/server.py
And it worked when we explicitly set PYTHONHOME, like this:
server      = /path/to/virtualenv/bin/python
server_args = /path/to/app/server.py
env         = PYTHONHOME=/path/to/virtualenv
So, what was different?
Turns out it was argv[0].
The "what prefix should I use" code is found in the CPython sources in
Modules/getpath.c
calculate_path()
:
char *prog = Py_GetProgramName();
...
if (strchr(prog, SEP))
    strncpy(progpath, prog, MAXPATHLEN);
...
strncpy(argv0_path, progpath, MAXPATHLEN);
argv0_path[MAXPATHLEN] = '\0';
...
joinpath(prefix, LANDMARK);
if (ismodule(prefix))
    return -1;  /* we have a valid environment! */
...
Where Py_GetProgramName() was initialized by Py_Main(), and LANDMARK happens to be os.py:
Py_SetProgramName(argv[0]);
#ifndef LANDMARK
#define LANDMARK "os.py"
#endif
And you've guessed it by now: xinetd was kind enough (thanks dude) to strip the dirname from the server value before passing it to execve(2). We expected it to call this:
execve("/path/to/virtualenv/bin/python", ["/path/to/virtualenv/bin/python", "/path/to/app/server.py"], [/* 1 var */])
But instead, it called this:
execve("/path/to/virtualenv/bin/python", ["python", "/path/to/app/server.py"], [/* 1 var */])
And that caused Python to lose its capability to find the closest environment.
After figuring that out, the world made sense again.
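Incidentally, the behaviour can be reproduced from a regular shell by overriding argv[0], for instance with bash's exec -a (paths as in the hypothetical example above); without a slash in argv[0], Python falls back to searching PATH and ends up with the system prefix:

$ /path/to/virtualenv/bin/python -c 'import sys; print sys.prefix'
/path/to/virtualenv
$ bash -c 'exec -a python /path/to/virtualenv/bin/python -c "import sys; print sys.prefix"'
/usr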
2016-01-15 - salt master losing children
I recently set up psdiff on a few of my servers as a basic means to monitor process activity.
It disclosed that my SaltStack master daemon (which I'm running as a non-privileged user) was losing a single child, exactly 24 hours after I had run salt commands. This seemed to be a recurring phenomenon.
The salt server — version 0.17.5+ds-1 on Ubuntu Trusty — was running these processes:
salt  0:00 /usr/bin/salt-master
salt  0:22  \_ [salt-master] <defunct>    <-- clear_old_jobs_proc
salt  0:00  \_ /usr/bin/salt-master       <-- start_publisher
salt  0:01  \_ /usr/bin/salt-master       <-- start_event_publisher
salt  0:04  \_ /usr/bin/salt-master       <-- worker process 1..5
salt  0:03  \_ /usr/bin/salt-master           as configured by the
salt  0:04  \_ /usr/bin/salt-master           'worker_threads' setting
salt  0:04  \_ /usr/bin/salt-master
salt  0:04  \_ /usr/bin/salt-master
So, why did the clear_old_jobs_proc die?
I added some debug code. Observe the %(process)5s in the log format, which adds the PID to log messages.
# tail /etc/salt/master.d/*
==> /etc/salt/master.d/salt-log.conf <==
log_level: info
log_fmt_logfile: '%(asctime)s,%(msecs)03.0f [%(name)-17s][%(levelname)-8s/%(process)5s] %(message)s'

==> /etc/salt/master.d/salt-security.conf <==
sign_pub_messages: True

==> /etc/salt/master.d/salt-user.conf <==
user: salt
And then a patch against master.py:
--- /usr/lib/python2.7/dist-packages/salt/master.py.orig	2016-01-14 00:10:41.218945899 +0100
+++ /usr/lib/python2.7/dist-packages/salt/master.py	2016-01-14 00:17:28.520966737 +0100
@@ -178,6 +178,15 @@ class Master(SMaster):
         SMaster.__init__(self, opts)
 
     def _clear_old_jobs(self):
+        log.warning('entering clear_old_jobs')
+        try:
+            self.__clear_old_jobs()
+        except BaseException as e:
+            log.exception('clear_old_jobs exception: {}'.format(e))
+            raise
+        log.exception('clear_old_jobs no exception at all')
+
+    def __clear_old_jobs(self):
         '''
         The clean old jobs function is the general passive maintenance process
         controller for the Salt master. This is where any data that needs to
Lo and behold, I got errors:
2016-01-15 00:17:54,253 [salt.master      ][ERROR   / 2836] clear_old_jobs exception: [Errno 13] Permission denied: '/var/cache/salt/master/.dfnt'
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/master.py", line 183, in _clear_old_jobs
    self.__clear_old_jobs()
  File "/usr/lib/python2.7/dist-packages/salt/master.py", line 227, in __clear_old_jobs
    salt.crypt.dropfile(self.opts['cachedir'])
  File "/usr/lib/python2.7/dist-packages/salt/crypt.py", line 73, in dropfile
    with salt.utils.fopen(dfnt, 'w+') as fp_:
  File "/usr/lib/python2.7/dist-packages/salt/utils/__init__.py", line 930, in fopen
    fhandle = open(*args, **kwargs)
IOError: [Errno 13] Permission denied: '/var/cache/salt/master/.dfnt'
That's odd: the salt-master runs as user salt, and both that file and its directory belong to that user. But the file only has read permissions. And as a non-superuser, trying to write to your own file will fail if it has no write permissions. Why does it have no write permissions if we're going to write to it?
So, grab a fresh copy of Salt from github and see if the bug is shallow.
$ python
>>> from salt.crypt import dropfile
[CRITICAL] Unable to import msgpack or msgpack_pure python modules
>>> dropfile('/tmp')
^Z
$ ls /tmp/.dfn -la
-r-------- 1 walter walter 0 jan 15 10:34 /tmp/.dfn
$ fg
>>> dropfile('/tmp')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "salt/crypt.py", line 66, in dropfile
    with salt.utils.fopen(dfn, 'wb+') as fp_:
  File "salt/utils/__init__.py", line 1209, in fopen
    fhandle = open(*args, **kwargs)
IOError: [Errno 13] Permission denied: '/tmp/.dfn'
Yay for shallow bugs! Double yay for not needing complex setup just to test this single function. Go Salt!
This should fix things:
diff --git a/salt/crypt.py b/salt/crypt.py
index 1809c8e..30a73da 100644
--- a/salt/crypt.py
+++ b/salt/crypt.py
@@ -60,11 +60,11 @@ def dropfile(cachedir, user=None):
 
     try:
         log.info('Rotating AES key')
-        if (salt.utils.is_windows() and os.path.isfile(dfn) and
-                not os.access(dfn, os.W_OK)):
-            os.chmod(dfn, stat.S_IWUSR)
+        if os.path.isfile(dfn) and not os.access(dfn, os.W_OK):
+            os.chmod(dfn, stat.S_IRUSR | stat.S_IWUSR)
         with salt.utils.fopen(dfn, 'wb+') as fp_:
             fp_.write('')
+        os.chmod(dfn, stat.S_IRUSR)
         if user:
             try:
                 import pwd
Or this patch against the packaged version:
--- /usr/lib/python2.7/dist-packages/salt/crypt.py.orig	2016-01-15 10:49:32.369935604 +0100
+++ /usr/lib/python2.7/dist-packages/salt/crypt.py	2016-01-18 14:10:35.608671989 +0100
@@ -10,9 +10,9 @@ import os
 import sys
 import time
 import hmac
-import shutil
 import hashlib
 import logging
+import stat
 
 # Import third party libs
 try:
@@ -38,7 +38,6 @@ def dropfile(cachedir, user=None):
     '''
     Set an aes dropfile to update the publish session key
     '''
-    dfnt = os.path.join(cachedir, '.dfnt')
     dfn = os.path.join(cachedir, '.dfn')
 
     def ready():
@@ -70,18 +69,24 @@ def dropfile(cachedir, user=None):
 
     aes = Crypticle.generate_key_string()
     mask = os.umask(191)
-    with salt.utils.fopen(dfnt, 'w+') as fp_:
-        fp_.write(aes)
-    if user:
-        try:
-            import pwd
-            uid = pwd.getpwnam(user).pw_uid
-            os.chown(dfnt, uid, -1)
-            shutil.move(dfnt, dfn)
-        except (KeyError, ImportError, OSError, IOError):
-            pass
-    os.umask(mask)
+    try:
+        log.info('Rotating AES key')
+
+        if os.path.isfile(dfn) and not os.access(dfn, os.W_OK):
+            os.chmod(dfn, stat.S_IRUSR | stat.S_IWUSR)
+        with salt.utils.fopen(dfn, 'wb+') as fp_:
+            fp_.write(aes)
+        os.chmod(dfn, stat.S_IRUSR)
+        if user:
+            try:
+                import pwd
+                uid = pwd.getpwnam(user).pw_uid
+                os.chown(dfn, uid, -1)
+            except (KeyError, ImportError, OSError, IOError):
+                pass
+    finally:
+        os.umask(mask)  # restore original umask
 
 
 def gen_keys(keydir, keyname, keysize, user=None):
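For reference, the os.umask(191) seen in the old code is 0277 octal, so the dropfile gets created with mode 0400; that is where the read-only file comes from. The arithmetic is easy to check:

$ python -c 'print oct(191), oct(0666 & ~191)'
0277 0400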
2016-01-13 - polyglot xhtml
Polyglot XHTML: Serving pages that are valid HTML and valid XML at the same time.
A number of documents have been written on the subject, which I shall not repeat here.
My summary:
- HTML5 is not going away.
- XHTML pages validate in the browser.
- If you can get better validation during the development of your website, then you'll save yourself time and headaches.
- Thus, for your development environment, you'll set the equivalent of this:
DEFAULT_CONTENT_TYPE = 'application/xhtml+xml'
(and a workaround for Django pages if they're out of your control)
- But for your production environment, you'll still use text/html.
- Even though your page is served as html, you can still use XML parsers to do processing on it.
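For example, you can feed a deployed page straight into xmllint during development (the URL is made up for this example):

$ curl -s https://www.example.com/some/page | xmllint --noout - && echo XML well-formed
XML well-formed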
Apparently, the W3C resource about polyglot XHTML has been taken out of maintenance, without an explanation. I figure they figured it's not worth the effort, as the whatwg wiki states: “You have to be really careful for this to work, and it's almost certainly not worth it. You'd be better off just using an HTML-to-XML parser.”
I think that's an exaggeration. Jesper Tverskov wrote an excellent article called Benefits of polyglot XHTML5 where he summarized how little work it is.
For convenience, I've copied the "polyglot in a nutshell" here:
If you are new to polyglot XHTML, it might seem quite a challenge. But if your document is already valid HTML5, well-formed, and uses lower-case for element and attribute names, you are pretty close. The following 10 points almost cover all of polyglot XHTML:
- Declare namespaces explicitly.
- Use both the lang and the xml:lang attribute.
- Use <meta charset="UTF-8"/>.
- Use tbody or thead or tfoot in tables.
- When col element is used in tables, also use colgroup.
- Don't use noscript element.
- Don't start pre and textarea elements with newline.
- Use innerHTML property instead of document.write().
- In script element, wrap JavaScript in out-commented CDATA section.
- Many names in SVG and one in MathML use lowerCamelCase.

The following additional rules will be picked up by HTML5 validation if violated:

- Don't use XML Declaration or processing instructions. Don't use xml:space and xml:base except in SVG and MathML.
- Elements that can have content must not be minimized to a single tag element. That is, <br/> is ok, but <p></p> must be used instead of <p/>.