2020-12-29 §
15:52 <vgutierrez> reloading nginx on cloudelastic1005 and cloudelastic1006 [production]
15:48 <vgutierrez> triggering a puppet run on cp nodes [production]
15:45 <vgutierrez> restarting acme-chief on acmechief1001 [production]
09:18 <elukey> restart hue to pick up analytics-hive endpoint settings [analytics]
2020-12-28 §
12:32 <arturo> stop doing backups for the dumps project https://gerrit.wikimedia.org/r/c/operations/puppet/+/652182 (T260692) [admin]
12:32 <arturo> stop doing backups for the dumps project https://gerrit.wikimedia.org/r/c/operations/puppet/+/652182 (T260682) [admin]
12:23 <arturo> icinga downtime cloudvirt1026 disk space check until january 5 (T260692) [admin]
09:54 <elukey> reboot an-coord1002 (puppet in D state after issues with broken disk - host in standby, no traffic) [production]
09:00 <wm-bot> <lucaswerkmeister> updated Python packages [tools.quickcategories]
06:15 <andrewbogott> restarting designate-central on cloudservices1003/1004. I'm pretty sure they're distressed because of DB lag but it's worth a try [admin]
2020-12-24 §
12:41 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
12:37 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]
12:34 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
12:23 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]
11:22 <volans> running on cumin1001: homer asw2-*-eqiad.mgmt.eqiad.wmnet commit "Fix numbering of an-worker hosts - T260445" [production]
11:08 <hashar> gerrit2001 (replica) restarting Gerrit server [production]
00:45 <legoktm> reset maxmind password [production]
2020-12-23 §
21:33 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
21:30 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
20:40 <legoktm> deploying https://gerrit.wikimedia.org/r/651819 [releng]
20:32 <bstorm> Created the directory /srv/misc/shared/wikilink/project on labstore1004 and verified puppet and nfs-exportd are happy T264107 [wikilink]
19:48 <James_F> Zuul: [mediawiki/tools/dependency-analysis] Add composer test CI [releng]
19:20 <bstorm> created clouddb-wikireplicas-proxy-1 and clouddb-wikireplicas-proxy-2 as well as the 16 neutron ports for wikireplicas proxying [clouddb-services]
19:03 <balloons> resized deployment-puppetdb03 to g2.cores2.ram4.disk40 (T270420) [deployment-prep]
19:03 <balloons> resized deployment-puppetdb03 to g2.cores2.ram4.disk40 (T270420) [releng]
16:58 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:51 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
16:51 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]
16:44 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
15:53 <ottomata> point analytics-hive.eqiad.wmnet back at an-coord1001 - T268028 T270768 [analytics]
15:38 <andrewbogott> restarting rabbitmq on cloudcontrol1004; suspected leaks [admin]
15:35 <wm-bot> <lucaswerkmeister> deployed 6d8bae537b (Esperanto verb) [tools.lexeme-forms]
15:33 <andrewbogott> restarting each cloudcontrol galera node in turn to see if that quiets down the syncing warnings [admin]
15:15 <cdanis> disabling puppet on alert1001 for klaxon rollout [production]
14:32 <wm-bot> <lucaswerkmeister> deployed 69f610af18 (Breton noun, without mutation, collective) [tools.lexeme-forms]
12:08 <arturo> move memory out of the swap in cloudcontrol1004 by disabling/enabling it (1Gb swap was being used) [admin]
09:59 <hashar> gerrit: removed old gerrit directory /srv/var-lib-gerrit2-cobalt.wikimedia.org/.gerritcodereview/ (was some tmp dirs for Gerrit jars) [production]
09:54 <volans> upgraded python3-wmflib to 0.0.5 on cumin1001 [production]
05:54 <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: [[gerrit:651682|Fix typo in autoreview right of eliminators in fawiki]] (duration: 00m 57s) [production]
2020-12-22 §
22:40 <James_F> Zuul: [integration/docroot] Only test PHP 7.3+ from now on [releng]
21:57 <mutante> apt1001 - sudo systemctl status rsync-aptrepo-apt2001.wikimedia.org.service - confirmed timer job is working like the cron before [production]
21:31 <mutante> deploy1002/deploy2002 - apt-get remove --purge php-readline and let puppet reinstall it (7.2 vs 7.3 after gerrit 651158) T265963 [production]
21:26 <andrewbogott> upgrading wikitech-static: mediawiki to 1.35.1 and general apt upgrade [production]
21:16 <James_F> Docker: Building and publishing tox-labs-striker:0.5.0 [releng]
20:26 <eileen> civicrm revision changed from e86e756807 to 6150267979, config revision is 52f1cbc5dd [production]
19:35 <elukey> restart hive daemons on an-coord1001 to pick up new settings [analytics]
19:32 <mutante> restarting gerrit to pick up config change in gitiles for T269300 [production]
18:29 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on labstore1004.eqiad.wmnet with reason: REIMAGE [production]
18:27 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on labstore1004.eqiad.wmnet with reason: REIMAGE [production]
18:22 <bstorm> rebooting the grid master because it is misbehaving following the NFS outage [tools]