production SAL

2021-09-08
14:57	<marostegui>	Retroactive: started to warm up eqiad databases	[production]
14:57	<moritzm>	installing 4.19.194 kernels on stretch systems with 4.19.x (no reboots yet)	[production]
14:54	<brennen>	gitlab: upgrading gitlab2001, followed by gitlab1001, to 14.2.3 (T289802)	[production]
14:53	<robh@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: REIMAGE	[production]
14:51	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: REIMAGE	[production]
14:33	<moritzm>	installing zeromq3 security updates	[production]
13:50	<mbsantos@deploy1002>	Finished deploy [kartotherian/deploy@eb211ac]: kartotherian: restore v4 maxzoom to z15 (duration: 06m 42s)	[production]
13:44	<mbsantos@deploy1002>	Started deploy [kartotherian/deploy@eb211ac]: kartotherian: restore v4 maxzoom to z15	[production]
13:38	<brennen>	gitlab: upgrading gitlab2001, followed by gitlab1001, to 14.1.5 (T289802)	[production]
13:13	<brennen>	gitlab1001: downtiming alerts for 2.5 hours; upgrading to 14.0.10 (T289802)	[production]
12:45	<brennen>	gitlab: pausing all runners in preparation for upgrade to 14.0.10 (T289802)	[production]
11:57	<moritzm>	installing curl security updates on stretch	[production]
11:09	<jbond>	upload statograph_0.1.2	[production]
11:02	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master	[production]
11:01	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 5:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master	[production]
11:01	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: name=maps1005.eqiad.wmnet	[production]
10:06	<jelto>	upgrade gitlab2001 to gitlab-ce=14.0.10-ce.0	[production]
10:03	<jelto@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T289802	[production]
10:03	<jelto@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T289802	[production]
09:38	<godog>	start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to wikimedia.org - T210137	[production]
09:29	<godog>	start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to codfw - T210137	[production]
09:09	<godog>	start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to eqiad - T210137	[production]
07:45	<godog>	start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to eqsin/esams/ulsfo - T210137	[production]
06:46	<ryankemper>	[WDQS] Manually running puppet-agent on `miscweb2002.codfw.wmnet,miscweb1002.eqiad.wmnet`	[production]
06:45	<ryankemper>	[WDQS] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/719185 to rollback query.wikidata.org changes	[production]
02:59	<eileen>	civicrm revision changed from 06ef98593f to 593d01f4fc, config revision is 5f004d94d7	[production]
00:00	<legoktm>	legoktm@lists1001:~$ sudo rm -rf /etc/mailman # cleanup as part of 4869d91b0be / T282303	[production]
2021-09-07
23:25	<robh@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
23:20	<robh@cumin1001>	START - Cookbook sre.dns.netbox	[production]
23:13	<ladsgroup@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:719381|Enable UrlShortener everywhere (T267925)]] (duration: 00m 58s)	[production]
23:07	<dpifke@deploy1002>	Synchronized wmf-config/profiler.php: Config: [[gerrit:716041|profiler: use seperate pipeline inside k8s pods (T288165)]] (duration: 00m 58s)	[production]
22:29	<cstone>	SmashPig revision changed from afd362b163 to 3607b16f83	[production]
20:41	<ladsgroup@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:715018|Set $wgWBRepoSettings['tmpNormalizeDataValues'] on all wikis (T251480)]] (duration: 00m 59s)	[production]
20:31	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
20:27	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
17:18	<jgiannelos@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' .	[production]
17:09	<jgiannelos@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' .	[production]
17:01	<jgiannelos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' .	[production]
16:39	<moritzm>	installing jetty9 security updates on buster	[production]
16:30	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue	[production]
16:30	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue	[production]
16:30	<dancy@deploy1002>	Synchronized README: testing (duration: 00m 59s)	[production]
15:18	<akosiaris>	run_benchmarky.py against mwdebug.svc.codfw.wmnet for performance tests	[production]
15:07	<akosiaris@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
15:04	<jbond>	upload python-prometheus-client_0.6.0 to stretch-wikimedia	[production]
14:50	<mutante>	snapshot1015 - manually removed prometheus-puppet-agent-stats from crontab which was sending spam and is now a timer	[production]
14:33	<mutante>	CI - migrating zuul-merger cronjob to systemd timer (contint*)	[production]
14:23	<XioNoX>	re-pool esams-eqiad - T288503	[production]
14:23	<cmjohnson@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE	[production]
14:23	<cmjohnson@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE	[production]