2020-12-29 §
15:52 <vgutierrez> reloading nginx on cloudelastic1005 and cloudelastic1006 [production]
15:48 <vgutierrez> triggering a puppet run on cp nodes [production]
15:45 <vgutierrez> restarting acme-chief on acmechief1001 [production]
2020-12-28 §
09:54 <elukey> reboot an-coord1002 (puppet in D state after issues with broken disk - host in standby, no traffic) [production]
2020-12-24 §
12:41 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
12:37 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]
12:34 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
12:23 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]
11:22 <volans> running on cumin1001: homer asw2-*-eqiad.mgmt.eqiad.wmnet commit "Fix numbering of an-worker hosts - T260445" [production]
11:08 <hashar> gerrit2001 (replica) restarting Gerrit server [production]
00:45 <legoktm> reset maxmind password [production]
2020-12-23 §
21:33 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
21:30 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
16:58 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:51 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
16:51 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]
16:44 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
15:15 <cdanis> disabling puppet on alert1001 for klaxon rollout [production]
09:59 <hashar> gerrit: removed old gerrit directory /srv/var-lib-gerrit2-cobalt.wikimedia.org/.gerritcodereview/ (was some tmp dirs for Gerrit jars ) [production]
09:54 <volans> upgraded python3-wmflib to 0.0.5 on cumin1001 [production]
05:54 <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: [[gerrit:651682|Fix typo in autoreview right of eliminators in fawiki]] (duration: 00m 57s) [production]
2020-12-22 §
21:57 <mutante> apt1001 - sudo systemctl status rsync-aptrepo-apt2001.wikimedia.org.service - confirmed timer job is working like the cron before [production]
21:31 <mutante> deploy1002/deploy2002 - apt-get remove --purge php-readline and let puppet reinstall it (7.2 vs 7.3 after gerrit 651158) T265963 [production]
21:26 <andrewbogott> upgrading wikitech-static: mediawiki to 1.35.1 and general apt upgrade [production]
20:26 <eileen> civicrm revision changed from e86e756807 to 6150267979, config revision is 52f1cbc5dd [production]
19:32 <mutante> restarting gerrit to pick up config change in gitiles for T269300 [production]
18:29 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on labstore1004.eqiad.wmnet with reason: REIMAGE [production]
18:27 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on labstore1004.eqiad.wmnet with reason: REIMAGE [production]
17:27 <andrewbogott> shutting down labstore1004 in preparation for move and reimage [production]
16:51 <mforns@deploy1001> Finished deploy [analytics/refinery@21c0c89] (thin): Regular analytics weekly train THIN [analytics/refinery@Ie7bce02179547ee4c6756d52f9956f492c5b4df6] (duration: 00m 08s) [production]
16:51 <mforns@deploy1001> Started deploy [analytics/refinery@21c0c89] (thin): Regular analytics weekly train THIN [analytics/refinery@Ie7bce02179547ee4c6756d52f9956f492c5b4df6] [production]
16:48 <volans> restarted ferm on ms-be1026 (failed with DNS query for 'ms-be1055.eqiad.wmnet' failed: query timed out ) [production]
16:15 <bstorm> downtimed and stopped puppet on labstore1004 and labstore1005 for failover T266202 [production]
15:23 <jgiannelos@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
15:12 <jgiannelos@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
15:08 <jgiannelos@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . [production]
11:52 <marostegui> Set db1151 to writable T269324 [production]
11:10 <jbond42> upload puppet 5.5.22 to jessie-wikimedia [production]
11:02 <jbond42> upload puppet 5.5.22 to stretch-wikimedia [production]
10:51 <volans@cumin2001> test SAL message from wmflib, please ignore [production]
10:06 <volans> upgraded python3-wmflib to 0.0.5 on cumin2001 [production]
08:52 <hashar> gerrit: running jhat heap analyzer on gerrit2001 # T263008 [production]
07:27 <elukey> reboot stat100[4-8] (analytics hadoop clients) for kernel upgrades [production]
00:20 <crusnov@deploy1001> Finished deploy [netbox/deploy@b17db99]: Redeploy of 2.9.10 to netbox-dev for dep test (duration: 00m 54s) [production]
00:19 <crusnov@deploy1001> Started deploy [netbox/deploy@b17db99]: Redeploy of 2.9.10 to netbox-dev for dep test [production]
2020-12-21 §
23:20 <legoktm@deploy1001> Synchronized docroot/noc/conf/index.php: noc: Fix "Currently active MediaWiki versions" (T235338) (duration: 00m 54s) [production]
22:26 <crusnov@deploy1001> Finished deploy [netbox/deploy@0362a12]: Deploy of 2.9.10 to netbox-dev for script testing p2 (duration: 00m 05s) [production]
22:26 <crusnov@deploy1001> Started deploy [netbox/deploy@0362a12]: Deploy of 2.9.10 to netbox-dev for script testing p2 [production]
22:26 <crusnov@deploy1001> Finished deploy [netbox/deploy@0362a12]: Deploy of 2.9.10 to netbox-dev for script testing (duration: 01m 01s) [production]
22:25 <sbassett> Deployed security patch T270453 [production]