__all__ SAL

2021-02-24
00:03	<pt1979@cumin2001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db2145.codfw.wmnet with reason: REIMAGE	[production]
00:02	<pt1979@cumin2001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db2145.codfw.wmnet with reason: REIMAGE	[production]
2021-02-23
23:11	<bstorm>	draining a bunch of k8s workers to clean up after dumps changes T272397	[tools]
23:06	<bstorm>	draining tools-k8s-worker-55 to clean up after dumps changes T272397	[tools]
22:52	<chaomodus>	Netbox 2.10 upgrade complete T265084	[production]
22:43	<bstorm>	set --property hw_scsi_model=virtio-scsi and --property hw_disk_bus=scsi on the main buster image in glance on eqiad1 T275430	[admin]
22:40	<bstorm>	rebuild the canary for 1028 after image changes and all is well T275430	[cloudvirt-canary]
22:39	<wm-bot>	<root> Stopping jdk11 webservice in CrashLoopBackOff caused by missing extra arguments to tell the pod what to run.	[tools.jb]
22:28	<crusnov@deploy1001>	Finished deploy [netbox/deploy@dabbf5e]: Deploying Netbox 2.10.4-wmf to production T265084 (duration: 06m 11s)	[production]
22:25	<wm-bot>	<root> Deleted Error state cronjob pods	[tools.citationhunt]
22:25	<ppchelko@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .	[production]
22:25	<ppchelko@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .	[production]
22:23	<ppchelko@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .	[production]
22:23	<ppchelko@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .	[production]
22:22	<wm-bot>	<root> Clean up completed pods	[tools.citationhunt]
22:22	<crusnov@deploy1001>	Started deploy [netbox/deploy@dabbf5e]: Deploying Netbox 2.10.4-wmf to production T265084	[production]
22:21	<ppchelko@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .	[production]
22:21	<ppchelko@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .	[production]
22:17	<chaomodus>	deploying Netbox 2.10 to production and associated work	[production]
21:48	<otto@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Fix typos in wgEventLoggingSchemas (duration: 01m 05s)	[production]
21:38	<jhuneidi@deploy1001>	rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.32 refs T274936	[production]
21:36	<ebernhardson@deploy1001>	Finished deploy [wikimedia/discovery/analytics@1344853]: apply spark env_vars to executors too (duration: 01m 46s)	[production]
21:34	<ebernhardson@deploy1001>	Started deploy [wikimedia/discovery/analytics@1344853]: apply spark env_vars to executors too	[production]
21:28	<jhuneidi@deploy1001>	Finished scap: testwikis wikis to 1.36.0-wmf.32 refs T274936 (duration: 36m 52s)	[production]
21:20	<wm-bot>	<root> Hard stop/start cycle. Pod in CrashLoopBackOff with average restart every 5 minutes for the last 2 months.	[tools.simplewd]
21:20	<ottomata>	started nodemanager on an-worker1112	[analytics]
21:15	<razzi>	rebalance kafka partitions for webrequest_upload partition 2	[analytics]
21:03	<otto@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .	[production]
21:03	<otto@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .	[production]
21:00	<ebernhardson@deploy1001>	Finished deploy [wikimedia/discovery/analytics@46a8ae1]: ores_bulk_ingest: namespace is not plural (duration: 01m 41s)	[production]
21:00	<otto@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .	[production]
20:59	<otto@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .	[production]
20:58	<ebernhardson@deploy1001>	Started deploy [wikimedia/discovery/analytics@46a8ae1]: ores_bulk_ingest: namespace is not plural	[production]
20:56	<dzahn@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host gitlab1002.eqiad.wmnet	[production]
20:55	<wm-bot>	<root> Stopped webservice. Pod in CrashLoopBackoff and restarting did not seem to help.	[tools.wikiloop]
20:52	<jhuneidi@deploy1001>	Started scap: testwikis wikis to 1.36.0-wmf.32 refs T274936	[production]
20:47	<wm-bot>	<root> Deleted deployment.apps/lilywhite.bot which was spawning pods into CrashLoopBackoff due to missing /data/project/lekhaki/tool-lekhaki/main.js entrypoint file.	[tools.lekhaki]
20:44	<ppchelko@deploy1001>	Synchronized wmf-config/CommonSettings.php: No-op: math enable talking to mathoid directly in labs, T274436 (duration: 00m 57s)	[production]
20:44	<wm-bot>	<root> Deleted "test" deployment and related pod stuck in CrashLoopBackoff.	[tools.adhs-wde]
20:38	<otto@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Fix typo in visualeditortemplatedialoguse - T275015 (duration: 01m 01s)	[production]
20:36	<andrewbogott>	adding r/o access to the eqiad1-glance-images ceph pool for the client.eqiad1-compute for T275430	[admin]
20:13	<razzi@cumin1001>	END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka jumbo cluster: Reboot kafka nodes - razzi@cumin1001	[production]
20:04	<dzahn@cumin1001>	START - Cookbook sre.ganeti.makevm for new host gitlab1002.eqiad.wmnet	[production]
19:54	<ryankemper@cumin1001>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)	[production]
19:54	<ryankemper@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
19:49	<ryankemper@cumin1001>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)	[production]
19:49	<ryankemper@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
19:43	<ryankemper>	[WDQS Deploy] Disk space low on `wdqs1009`, rolling back so that can be addressed	[production]
19:43	<ryankemper@deploy1001>	Finished deploy [wdqs/wdqs@b5fc9d5]: 0.3.64 (duration: 08m 01s)	[production]
19:38	<otto@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Declare WMDE Technical Wishes streams and migrate to EventGate on testwiki (duration: 02m 41s)	[production]