2020-11-23 §
12:27 <Lucas_WMDE> Deployed patch for T260349 [production]
12:25 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1121 to clone clouddb1017:3314 clouddb1019:3314 T267090', diff saved to https://phabricator.wikimedia.org/P13366 and previous config saved to /var/cache/conftool/dbconfig/20201123-122549-marostegui.json [production]
12:07 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: c00d7e8e4c407b76aa2930dfa040394e874d77bc: Move ContentTranslation out of Beta for br, ka, ast, si and ig WPs (T267212, T266217, T266218, T266219, T266220) (duration: 01m 06s) [production]
12:01 <Urbanecm> Start of mwscript extensions/AbuseFilter/maintenance/updateVarDumps.php --wiki=$wiki --print-orphaned-records-to=/tmp/urbanecm/$wiki-orphaned.log --progress-markers > $wiki.log in a tmux at mwmaint1002 (wiki=zhwiki; T246539) [production]
11:49 <XioNoX> eqiad row A, split LVS, Ganeti, Cloud, interface-ranges to individual terms [production]
11:38 <jdrewniak@deploy1001> Synchronized portals: Wikimedia Portals Update: [[gerrit:643018| Bumping portals to master (T128546)]] (duration: 01m 05s) [production]
11:37 <jdrewniak@deploy1001> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:643018| Bumping portals to master (T128546)]] (duration: 01m 21s) [production]
11:25 <hnowlan> starting cassandra bootstrap of maps2008 [production]
11:20 <effie> enable puppet on cp* hosts [production]
11:16 <moritzm> installing poppler security updates on stretch [production]
11:13 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) [production]
11:13 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
11:05 <XioNoX> eqiad row A, standardize interfaces descriptions and ranges order [production]
10:35 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
10:26 <effie> disable puppet on cp* hosts to merge 641730 [production]
10:26 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single [production]
10:26 <moritzm> rebooting serpens [production]
10:21 <XioNoX> eqiad row B, split LVS, Ganeti, Cloud, interface-ranges to individual terms [production]
09:48 <XioNoX> eqiad row B, standardize interfaces descriptions and ranges order [production]
08:46 <elukey> drop kerberos keytabs for analytics10[28-41] from krb1001:/srv/kerberos/keytabs, decommed nodes (old hadoop test cluster) [production]
08:43 <godog> start stress testing on ms-be106* - T268435 [production]
08:41 <elukey> drop kerberos principals from krb1001 for analytics10[29-41], decommed nodes (old hadoop test cluster) [production]
08:36 <elukey> drop analytics1028's krb principals from krb1001 - old decommed node [production]
08:35 <moritzm> installing remaining krb5 security updates for Stretch [production]
07:27 <marostegui> Stop MySQL on db1125:3314 to clone clouddb1015 and clouddb1019 - lag will appear on Commonswiki on wikireplicas - T267090 [production]
07:06 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
07:00 <marostegui@cumin1001> START - Cookbook sre.hosts.decommission [production]
06:46 <marostegui> Restart clouddb1013 clouddb1015 clouddb1017 clouddb1019 for testing T267090 [production]
2020-11-21 §
09:18 <joal> Drop historical logs of 'Wikidata Concepts Monitor ETL' on HDFS keeping one example - freeing 60Tb [production]
09:17 <joal> Drop historical logs of ' [production]
08:28 <ariel@deploy1001> Finished deploy [dumps/dumps@1a76a9a]: revinfo updates (duration: 00m 05s) [production]
08:28 <ariel@deploy1001> Started deploy [dumps/dumps@1a76a9a]: revinfo updates [production]
08:10 <elukey> remove big stderrlog file in /var/lib/hadoop/data/d/yarn/logs/application_1605880843685_1450 on an-worker1110 [production]
08:05 <elukey> remove big stderrlog file in /var/lib/hadoop/data/e/yarn/logs/application_1605880843685_1450 on an-worker1105 [production]
2020-11-20 §
23:38 <mutante> synced puppet-compiler facts - new hosts should be usable in compiler [production]
22:30 <mutante> cumin1001 - sudo systemctl start cumin-check-aliases -> <+icinga-wm> RECOVERY - Check systemd state on cumin1001 is OK T268369 [production]
21:30 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
20:26 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
20:09 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
19:52 <mutante> releases2002 - systemctl disable wmf_auto_restart_rsync; rm /usr/lib/systemd/system/wmf_auto_restart_rsync.* ; systemctl daemon-reload ; systemctl reset-failed - clear up systemd unit that was not absented and fix Icinga alerts [production]
19:45 <mutante> releases2002 systemctl reset-failed (wmf_auto_restart_rsync.service failed but hopefully fixed) [production]
19:39 <mutante> Icinga: ACKing all the "unhandled CRIT" alerts on clouddb* and an-coord* that have disabled notifications to remove monitoring noise; from 72 to 25 active alerts [production]
19:14 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
18:47 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
18:42 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:37 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
18:36 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
18:31 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:31 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
18:18 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]