2021-09-08 §
13:13 <brennen> gitlab1001: downtiming alerts for 2.5 hours; upgrading to 14.0.10 (T289802) [production]
12:45 <brennen> gitlab: pausing all runners in preparation for upgrade to 14.0.10 (T289802) [production]
11:57 <moritzm> installing curl security updates on stretch [production]
11:09 <jbond> upload statograph_0.1.2 [production]
11:02 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master [production]
11:01 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 5:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master [production]
11:01 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=maps1005.eqiad.wmnet [production]
10:06 <jelto> upgrade gitlab2001 to gitlab-ce=14.0.10-ce.0 [production]
10:03 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T289802 [production]
10:03 <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T289802 [production]
09:38 <godog> start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to wikimedia.org - T210137 [production]
09:29 <godog> start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to codfw - T210137 [production]
09:09 <godog> start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to eqiad - T210137 [production]
07:45 <godog> start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to eqsin/esams/ulsfo - T210137 [production]
06:46 <ryankemper> [WDQS] Manually running puppet-agent on `miscweb2002.codfw.wmnet,miscweb1002.eqiad.wmnet` [production]
06:45 <ryankemper> [WDQS] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/719185 to rollback query.wikidata.org changes [production]
02:59 <eileen> civicrm revision changed from 06ef98593f to 593d01f4fc, config revision is 5f004d94d7 [production]
00:00 <legoktm> legoktm@lists1001:~$ sudo rm -rf /etc/mailman # cleanup as part of 4869d91b0be / T282303 [production]
2021-09-07 §
23:25 <robh@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
23:20 <robh@cumin1001> START - Cookbook sre.dns.netbox [production]
23:13 <ladsgroup@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:719381|Enable UrlShortener everywhere (T267925)]] (duration: 00m 58s) [production]
23:07 <dpifke@deploy1002> Synchronized wmf-config/profiler.php: Config: [[gerrit:716041|profiler: use seperate pipeline inside k8s pods (T288165)]] (duration: 00m 58s) [production]
22:29 <cstone> SmashPig revision changed from afd362b163 to 3607b16f83 [production]
20:41 <ladsgroup@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:715018|Set $wgWBRepoSettings['tmpNormalizeDataValues'] on all wikis (T251480)]] (duration: 00m 59s) [production]
20:31 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
20:27 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
17:18 <jgiannelos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
17:09 <jgiannelos@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
17:01 <jgiannelos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
16:39 <moritzm> installing jetty9 security updates on buster [production]
16:30 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue [production]
16:30 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue [production]
16:30 <dancy@deploy1002> Synchronized README: testing (duration: 00m 59s) [production]
15:18 <akosiaris> run_benchmarky.py against mwdebug.svc.codfw.wmnet for performance tests [production]
15:07 <akosiaris@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
15:04 <jbond> upload python-prometheus-client_0.6.0 to stretch-wikimedia [production]
14:50 <mutante> snapshot1015 - manually removed prometheus-puppet-agent-stats from crontab which was sending spam and is now a timer [production]
14:33 <mutante> CI - migrating zuul-merger cronjob to systemd timer (contint*) [production]
14:23 <XioNoX> re-pool esams-eqiad - T288503 [production]
14:23 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE [production]
14:23 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE [production]
14:22 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE [production]
14:22 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE [production]
14:17 <marostegui> No more db maintenance on eqiad T288594 [production]
14:08 <mutante> alert1001 - temp disabled puppet, stopped icinga-wm [production]
14:07 <mutante> temp killed icinga-wm because of flooding [production]
14:01 <Emperor> removing pc2010 from orchestrator T289117 [production]
13:59 <Emperor> removing pc2010 from tendril and zarcillo T289117 [production]
13:57 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
13:57 <XioNoX> drain esams-eqiad for circuit maintenance - T288503 [production]