production SAL

2021-02-01
09:39	<elukey@cumin1001>	START - Cookbook sre.dns.netbox	[production]
09:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 20%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14082 and previous config saved to /var/cache/conftool/dbconfig/20210201-093041-root.json	[production]
09:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1166 (re)pooling @ 6%: Slowly pooling db1166 for the first time', diff saved to https://phabricator.wikimedia.org/P14081 and previous config saved to /var/cache/conftool/dbconfig/20210201-092722-root.json	[production]
09:27	<dcausse>	restarting blazegraph on wdqs1013	[production]
09:24	<XioNoX>	renumber gr-3/3/0.1 local endpoint on cr1-eqiad	[production]
09:15	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 15%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14080 and previous config saved to /var/cache/conftool/dbconfig/20210201-091538-root.json	[production]
09:12	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1166 (re)pooling @ 5%: Slowly pooling db1166 for the first time', diff saved to https://phabricator.wikimedia.org/P14079 and previous config saved to /var/cache/conftool/dbconfig/20210201-091218-root.json	[production]
09:04	<gilles@deploy1001>	Finished deploy [performance/navtiming@3215510]: T271208 browser_minor is needed for Mobile Safari allowlist (duration: 00m 05s)	[production]
09:04	<gilles@deploy1001>	Started deploy [performance/navtiming@3215510]: T271208 browser_minor is needed for Mobile Safari allowlist	[production]
09:03	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1054.eqiad.wmnet with reason: reboot	[production]
09:03	<filippo@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be1054.eqiad.wmnet with reason: reboot	[production]
09:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 12%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14078 and previous config saved to /var/cache/conftool/dbconfig/20210201-090034-root.json	[production]
09:00	<elukey@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
08:57	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1166 (re)pooling @ 3%: Slowly pooling db1166 for the first time', diff saved to https://phabricator.wikimedia.org/P14077 and previous config saved to /var/cache/conftool/dbconfig/20210201-085714-root.json	[production]
08:56	<marostegui>	Stop MySQL on db1089 - T273417	[production]
08:53	<gilles@deploy1001>	Finished deploy [performance/navtiming@1e02d76]: T271208 Add more debug logging (duration: 00m 05s)	[production]
08:53	<gilles@deploy1001>	Started deploy [performance/navtiming@1e02d76]: T271208 Add more debug logging	[production]
08:53	<elukey@cumin1001>	START - Cookbook sre.dns.netbox	[production]
08:45	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 10%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14075 and previous config saved to /var/cache/conftool/dbconfig/20210201-084531-root.json	[production]
08:45	<marostegui@cumin1001>	dbctl commit (dc=all): 'Remove db1089 from dbctl T273417', diff saved to https://phabricator.wikimedia.org/P14074 and previous config saved to /var/cache/conftool/dbconfig/20210201-084523-marostegui.json	[production]
08:42	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1166 (re)pooling @ 4%: Slowly pooling db1166 for the first time', diff saved to https://phabricator.wikimedia.org/P14073 and previous config saved to /var/cache/conftool/dbconfig/20210201-084211-root.json	[production]
08:29	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 7%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14072 and previous config saved to /var/cache/conftool/dbconfig/20210201-082933-root.json	[production]
08:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1166 (re)pooling @ 2%: Slowly pooling db1166 for the first time', diff saved to https://phabricator.wikimedia.org/P14071 and previous config saved to /var/cache/conftool/dbconfig/20210201-082707-root.json	[production]
08:17	<godog>	swift codfw-prod decrease HDD weight for ms-be20[16-27] - T272837	[production]
08:15	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1166 with minimal weight for the first time T258361', diff saved to https://phabricator.wikimedia.org/P14070 and previous config saved to /var/cache/conftool/dbconfig/20210201-081554-marostegui.json	[production]
08:14	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 5%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14069 and previous config saved to /var/cache/conftool/dbconfig/20210201-081429-root.json	[production]
08:05	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1166 to dbctl, depooled T258361', diff saved to https://phabricator.wikimedia.org/P14068 and previous config saved to /var/cache/conftool/dbconfig/20210201-080520-marostegui.json	[production]
07:59	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 3%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14067 and previous config saved to /var/cache/conftool/dbconfig/20210201-075926-root.json	[production]
07:44	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1175 (re)pooling @ 2%: Slowly pooling db1175 for the first time', diff saved to https://phabricator.wikimedia.org/P14066 and previous config saved to /var/cache/conftool/dbconfig/20210201-074422-root.json	[production]
07:36	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1175 with some more minimal weight T258361', diff saved to https://phabricator.wikimedia.org/P14065 and previous config saved to /var/cache/conftool/dbconfig/20210201-073603-marostegui.json	[production]
07:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1094 (re)pooling @ 100%: After fixing replication', diff saved to https://phabricator.wikimedia.org/P14064 and previous config saved to /var/cache/conftool/dbconfig/20210201-070429-root.json	[production]
06:49	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1094 (re)pooling @ 75%: After fixing replication', diff saved to https://phabricator.wikimedia.org/P14063 and previous config saved to /var/cache/conftool/dbconfig/20210201-064926-root.json	[production]
06:39	<marostegui>	Run analyze table on db2071 and db2102	[production]
06:34	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1094 (re)pooling @ 50%: After fixing replication', diff saved to https://phabricator.wikimedia.org/P14062 and previous config saved to /var/cache/conftool/dbconfig/20210201-063422-root.json	[production]
06:23	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1175 to dbctl, depooled T258361', diff saved to https://phabricator.wikimedia.org/P14061 and previous config saved to /var/cache/conftool/dbconfig/20210201-062358-marostegui.json	[production]
06:19	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1094 (re)pooling @ 25%: After fixing replication', diff saved to https://phabricator.wikimedia.org/P14060 and previous config saved to /var/cache/conftool/dbconfig/20210201-061919-root.json	[production]
06:10	<marostegui>	Upgrade db2071 and db2102 to 10.4.18	[production]
06:04	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1094 (re)pooling @ 10%: After fixing replication', diff saved to https://phabricator.wikimedia.org/P14059 and previous config saved to /var/cache/conftool/dbconfig/20210201-060415-root.json	[production]
05:58	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1094', diff saved to https://phabricator.wikimedia.org/P14058 and previous config saved to /var/cache/conftool/dbconfig/20210201-055851-marostegui.json	[production]
2021-01-29
23:26	<razzi@cumin1001>	END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka test cluster: Reboot kafka nodes - razzi@cumin1001	[production]
22:36	<dancy@deploy1001>	Finished scap: MW servers complaining about l10n files after .27 rollback (duration: 07m 22s)	[production]
22:29	<dancy@deploy1001>	Started scap: MW servers complaining about l10n files after .27 rollback	[production]
22:26	<dancy@deploy1001>	rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.27	[production]
22:20	<reedy@deploy1001>	Synchronized php-1.36.0-wmf.27/includes/parser/CacheTime.php: CacheTime: Extra protection for rollback unserialization T273007 (duration: 01m 00s)	[production]
22:14	<dancy@deploy1001>	rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.28	[production]
22:09	<dancy@deploy1001>	scap failed: average error rate on 8/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)	[production]
21:42	<razzi>	rebalance kafka partitions for codfw.resource_change	[production]
21:40	<razzi@cumin1001>	START - Cookbook sre.kafka.reboot-workers for Kafka test cluster: Reboot kafka nodes - razzi@cumin1001	[production]
19:26	<razzi@cumin1001>	END (FAIL) - Cookbook sre.kafka.reboot-workers (exit_code=99) for Kafka test cluster: Reboot kafka nodes - razzi@cumin1001	[production]
19:26	<razzi@cumin1001>	START - Cookbook sre.kafka.reboot-workers for Kafka test cluster: Reboot kafka nodes - razzi@cumin1001	[production]