2020-11-23
08:36 <elukey> drop analytics1028's krb principals from krb1001 - old decommed node [production]
08:35 <moritzm> installing remaining krb5 security updates for Stretch [production]
07:27 <marostegui> Stop MySQL on db1125:3314 to clone clouddb1015 and clouddb1019 - lag will appear on Commonswiki on wikireplicas - T267090 [production]
07:06 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
07:00 <marostegui@cumin1001> START - Cookbook sre.hosts.decommission [production]
06:46 <marostegui> Restart clouddb1013 clouddb1015 clouddb1017 clouddb1019 for testing T267090 [production]
2020-11-22
17:40 <andrewbogott> apt-get upgrade on cloudservices1003/1004 [admin]
17:32 <andrewbogott> upgrading Designate on cloudservices1003/1004 to Stein [admin]
2020-11-21
21:25 <wm-bot> <lucaswerkmeister> deployed 1608cc4dd9 (gender-dependent messages) [tools.lexeme-forms]
09:18 <joal> Drop historical logs of 'Wikidata Concepts Monitor ETL' on HDFS keeping one example - freeing 60Tb [production]
09:17 <joal> Drop historical logs of ' [production]
08:28 <ariel@deploy1001> Finished deploy [dumps/dumps@1a76a9a]: revinfo updates (duration: 00m 05s) [production]
08:28 <ariel@deploy1001> Started deploy [dumps/dumps@1a76a9a]: revinfo updates [production]
08:10 <elukey> remove big stderrlog file in /var/lib/hadoop/data/d/yarn/logs/application_1605880843685_1450 on an-worker1110 [analytics]
08:10 <elukey> remove big stderrlog file in /var/lib/hadoop/data/d/yarn/logs/application_1605880843685_1450 on an-worker1110 [production]
08:05 <elukey> remove big stderrlog file in /var/lib/hadoop/data/e/yarn/logs/application_1605880843685_1450 on an-worker1105 [analytics]
08:05 <elukey> remove big stderrlog file in /var/lib/hadoop/data/e/yarn/logs/application_1605880843685_1450 on an-worker1105 [production]
2020-11-20
23:38 <mutante> synced puppet-compiler facts - new hosts should be usable in compiler [production]
23:15 <mutante> syncing facts from production masters [puppet-diffs]
22:30 <mutante> cumin1001 - sudo systemctl start cumin-check-aliases -> <+icinga-wm> RECOVERY - Check systemd state on cumin1001 is OK T268369 [production]
21:30 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
21:09 <razzi> truncate /var/lib/hadoop/data/u/yarn/logs/application_1605880843685_0581/container_e27_1605880843685_0581_01_000171/stderr logfile on an-worker1098 [analytics]
20:40 <mutante> added new member razzi [puppet-diffs]
20:26 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
20:09 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
19:52 <mutante> releases2002 - systemctl disable wmf_auto_restart_rsync; rm /usr/lib/systemd/system/wmf_auto_restart_rsync.* ; systemctl daemon-reload ; systemctl reset-failed - clear up systemd unit that was not absented and fix Icinga alerts [production]
19:45 <mutante> releases2002 systemctl reset-failed (wmf_auto_restart_rsync.service failed but hopefully fixed) [production]
19:39 <mutante> Icinga: ACKing all the "unhandled CRIT" alerts on clouddb* and an-coord* that have disabled notifications to remove monitoring noise. from 72 to 25 active alerts [production]
19:17 <Jayprakash12345> Deploying app (T267488) [tools.book2scrollv2]
19:14 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
18:47 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
18:42 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:37 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
18:36 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
18:31 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:31 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
18:18 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
18:14 <dwisehaupt> shifting 100% of thank_you mail through frmxs ahead of tomorrow's banner test - T267259 [production]
17:37 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
17:35 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
17:32 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
17:24 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
16:48 <volans@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
16:40 <volans@cumin1001> START - Cookbook sre.hosts.decommission [production]
16:29 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
16:29 <razzi@cumin1001> END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) [production]
16:28 <razzi> removed canceled ip address records for kafka-test1002 from netbox [production]
16:11 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:09 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
16:06 <James_F> Zuul: [labs/tools/book2scroll] Provide CI with tox-docker T267488 [releng]