2021-06-16
10:34 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on maps1007.eqiad.wmnet with reason: REIMAGE [production]
10:23 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16535 and previous config saved to /var/cache/conftool/dbconfig/20210616-102349-marostegui.json [production]
09:52 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=maps1007.eqiad.wmnet [production]
09:51 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on maps1007.eqiad.wmnet with reason: Reparenting from maps1009 [production]
09:51 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on maps1007.eqiad.wmnet with reason: Reparenting from maps1009 [production]
09:50 <hnowlan> disabling puppet on maps1* to reparent maps1007 from new master maps1009 [production]
09:47 <kormat> truncating all pc* tables on pc1010 T282761 [production]
09:40 <kormat@deploy1002> Synchronized wmf-config/db-eqiad.php: Repool pc1009 as pc3 primary T282761 (duration: 00m 59s) [production]
09:04 <kormat> Deploying wmfmariadbpy 0.7.1 T284819 [production]
09:04 <kormat> uploaded wmfmariadbpy 0.7.1 to apt.wm.o [production]
08:24 <Amir1> running "update flaggedrevs set fr_quality = 0 where fr_quality != 0;" on all wikis where flagged revs is enabled (T279761) [production]
07:27 <dcausse> cleanup old /var/log/airflow/scheduler logs to reclaim space on an-airflow1001 [production]
06:55 <volans@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
06:52 <volans@cumin1001> START - Cookbook sre.dns.netbox [production]
05:44 <majavah> restart trafficserver-tls.service on deployment-cache-upload06, was using an expired cert [releng]
05:06 <marostegui> Upgrade clouddb1014 [production]
2021-06-15
23:52 <wm-bot> <bd808> Adding #wikimedia-sp bridge (T283308) (try #2) [tools.bridgebot]
23:44 <wm-bot> <bd808> Adding #wikimedia-sp bridge (T283308) [tools.bridgebot]
22:58 <wm-bot> <bd808> Adding #iabot bridge (T285021) [tools.bridgebot]
20:11 <wm-bot> <lucaswerkmeister> deployed 0b6fed0054 (even more optional grammatical features) [tools.lexeme-forms]
19:32 <wm-bot> <lucaswerkmeister> deployed d8eadd1cae (more optional grammatical features) [tools.lexeme-forms]
19:02 <bstorm> cleared error status from a few queues [tools]
18:58 <wm-bot> <lucaswerkmeister> deployed 61a5e0fc18 (optional grammatical features) [tools.lexeme-forms]
17:54 <dancy> testing upcoming Scap release on beta [production]
17:46 <razzi> remove hdfs namenode backup on stat1004 [analytics]
17:45 <razzi> enable puppet on an-launcher [analytics]
17:45 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
17:21 <mutante> new Wikimedia language "shi" added - Shilha /ˈʃɪlhə/ is a Berber language native to Shilha people. The endonym is Taclḥit /taʃlʜijt/, and in recent English publications the language is often rendered Tashelhiyt or Tashelhit. [production]
17:17 <mutante> new Wikimedia language "dag" added - Dagbani (or Dagbane), also known as Dagbanli and Dagbanle, is a Gur language spoken in Ghana. [production]
17:11 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE [production]
17:09 <razzi@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE [production]
16:55 <razzi> sudo -i wmf-auto-reimage-host -p T278423 an-master1002.eqiad.wmnet [analytics]
16:53 <razzi> run uid script on an-master1002 [analytics]
16:33 <elukey> restart hadoop-yarn-resourcemanager on an-master1001 [analytics]
16:31 <bstorm> truncated 26GB error.log T284964 [tools.stimmberechtigung]
16:16 <razzi> sudo systemctl stop 'hadoop-*' on an-master1002 [analytics]
16:15 <majavah> deleting unused shutdown nodes: tools-checker-03 tools-k8s-haproxy-1 tools-k8s-haproxy-2 [tools]
16:14 <razzi> sudo systemctl stop hadoop-* on an-master1001, then realize I meant to do this on an-master1002, so start hadoop-* [analytics]
16:12 <balloons> add 8 CPU/16G RAM to quota T284973 [metricsinfra]
16:11 <razzi> downtime an-master1002 [analytics]
16:11 <razzi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye [production]
16:11 <razzi@cumin1001> START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye [production]
16:09 <majavah> set toolsbeta-bastion-05 as grid submit host [toolsbeta]
16:08 <bstorm> truncated 28GB person_bkl2.out T284964 [tools.persondata]
15:55 <razzi> sudo transfer.py an-master1001.eqiad.wmnet:/srv/hadoop/backup/hdfs-namenode-snapshot-buster-reimage-2021-06-15.tar.gz stat1004.eqiad.wmnet:/home/razzi/hdfs-namenode-fsimage [analytics]
15:54 <bstorm> truncated 42GB virgule.err file T284964 [tools.robokobot]
15:42 <razzi> tar -czf /srv/hadoop/backup/hdfs-namenode-snapshot-buster-reimage-$(date --iso-8601).tar.gz current on an-master1001 [analytics]
15:38 <razzi> backup /srv/hadoop/name/current to /home/razzi/hdfs-namenode-snapshot-buster-reimage-2021-06-15.tar.gz on an-master1001 [analytics]
15:33 <razzi> sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -saveNamespace [analytics]
15:28 <MacFan4000> copied freenode channel config for #wikimedia-fundraising to libera [wm-bot]