2020-05-08
16:48 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
16:39 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:37 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
16:14 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:12 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
15:43 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:41 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
15:36 <ottomata> starting kafka broker on kafka-jumbo1006, same issue on other brokers when they are leaders of offending partitions - T252203 [production]
15:31 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:28 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
15:27 <ottomata> stopping kafka broker on kafka-jumbo1006 to investigate camus import failures - T252203 [production]
14:50 <otto@deploy1001> Finished deploy [analytics/refinery@4a2c530]: fix for camus wrapper, deploy to an-launcher1001 only (duration: 00m 03s) [production]
14:50 <otto@deploy1001> Started deploy [analytics/refinery@4a2c530]: fix for camus wrapper, deploy to an-launcher1001 only [production]
14:05 <akosiaris> T243106 undo experiment with DROP iptable rules this time around. Use mw1331, mw1348 [production]
13:22 <vgutierrez> rolling restart of ats-tls on eqiad, codfw, ulsfo and eqsin - T249335 [production]
13:20 <akosiaris> T243106 redo experiment with DROP iptable rules this time around. Use mw1331, mw1348 [production]
13:16 <akosiaris> T243106 undo experiment with REJECT, DROP iptable rules now that we have envoy in the middle. Use mw1331, mw1348. Experiment done successfully, no issues to the infrastructure. [production]
12:49 <akosiaris> T243106 redo experiment with REJECT, DROP iptable rules now that we have envoy in the middle. Use mw1331, mw1348 [production]
12:49 <akosiaris> T243106 redo experiment with REJECT, DROP iptable rules now that we have envoy in the middle [production]
11:49 <hnowlan> restarting cassandra on restbase2009 for java updates [production]
11:28 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
11:25 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:08 <akosiaris> repool eqiad eventgate-analytics. Test concluded [production]
11:08 <akosiaris@cumin1001> conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=eventgate-analytics [production]
09:54 <mutante> disabling puppet on puppetmasters temporarily to switch them carefully to use httpd module and not apache module which we want to get rid of [production]
09:52 <akosiaris> depool eqiad eventgate-analytics for a test involving reinitializing the eqiad kubernetes cluster [production]
09:52 <akosiaris@cumin1001> conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=eventgate-analytics [production]
09:51 <akosiaris@cumin1001> conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=eventgate-analytics [production]
09:45 <oblivian@puppetmaster1001> conftool action : set/ttl=10; selector: dnsdisc=eventgate-analytics.* [production]
08:20 <vgutierrez> rolling restart of ats-tls on esams - T249335 [production]
07:19 <vgutierrez> ats-tls restart on cp3050 and cp3052 (max_connections_active_in experiment) - T249335 [production]
07:07 <mutante> phabricator rmdir /var/run/phd/pid - empty and now unused [production]
07:01 <moritzm> installing php5 security updates [production]
05:27 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
05:24 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
05:10 <marostegui> Upgrade pc1010 [production]
00:30 <brennen@deploy1001> rebuilt and synchronized wikiversions files: Revert all wikis except test to 1.35.0-wmf.30 for T252179 [production]
00:19 <brennen> rolling 1.35.0-wmf.31 train back to group0 for T252179 [production]
2020-05-07
22:36 <brennen@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.31 [production]
22:31 <brennen@deploy1001> Synchronized php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaCommon/TitleLibrary.php: [[gerrit:595054|Handle RevisionAccessException with try-catch (T252156)]] (duration: 01m 08s) [production]
20:40 <ryankemper@cumin2001> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) [production]
20:37 <ryankemper@cumin2001> START - Cookbook sre.wdqs.data-transfer [production]
20:10 <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: wgEventLoggingStreamNames: set initial stream names, as yet unused - T238230 (duration: 01m 07s) [production]
19:12 <brennen@deploy1001> rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.35.0-wmf.30 [production]
19:09 <brennen> rolling 1.35.0-wmf.31 back to group1 [production]
19:09 <XioNoX> Upgrade Routinator 3000 to 0.7.0 on rpki1001 - T252010 [production]
19:05 <brennen@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.31 [production]
18:25 <ppchelko@deploy1001> Finished deploy [changeprop/deploy@383fba5]: Enable both purging types T252142 (duration: 01m 17s) [production]
18:23 <ppchelko@deploy1001> Started deploy [changeprop/deploy@383fba5]: Enable both purging types T252142 [production]
18:15 <Urbanecm> Morning SWAT done [production]