2021-05-09
|
21:44 |
<legoktm> |
restarted mailman3 again (T282348): pymysql.err.InternalError: (1205, 'Lock wait timeout exceeded; try restarting transaction') |
[production] |
18:28 |
<legoktm> |
systemctl restart mailman3, bounce runner died again (T282348) |
[production] |
13:50 |
<wm-bot> |
<lucaswerkmeister> deployed 5951b46450 (fix lang= and dir= on index) |
[tools.lexeme-forms] |
10:53 |
<arturo> |
icinga-downtime cloudmetrics1002 for 3 months (T275605) |
[admin] |
10:52 |
<aborrero@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 180 days, 0:00:00 on cloudmetrics1002.eqiad.wmnet with reason: T275605 |
[production] |
10:52 |
<aborrero@cumin1001> |
START - Cookbook sre.hosts.downtime for 180 days, 0:00:00 on cloudmetrics1002.eqiad.wmnet with reason: T275605 |
[production] |
09:16 |
<legoktm> |
mailman3 live hacked patch at https://phabricator.wikimedia.org/T282348#7072358 to fix bounce queue |
[production] |
06:55 |
<Majavah> |
clear error state from tools-sgeexec-0916 |
[tools] |
06:21 |
<legoktm> |
restarting mailman3 service, bounce runner died |
[production] |
04:27 |
<Amir1> |
starting upgrade of batch H of mailing lists (T280322) |
[production] |
2021-05-07
|
21:40 |
<legoktm> |
deleted education@ from MM3, didn't import properly |
[production] |
21:35 |
<legoktm> |
deleted festivalsommer-teilnehmer from MM3, didn't import properly |
[production] |
21:33 |
<legoktm> |
fixed owner for wdqs-gui-build list |
[production] |
20:52 |
<wm-bot> |
<bd808> Restarted to pick up 560a22a (T243394) |
[tools.jouncebot] |
20:07 |
<wm-bot> |
<bd808> Restarting due to irc<->telegram message drops |
[tools.bridgebot] |
19:48 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
19:42 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
18:55 |
<legoktm> |
deleted daily-article-l from mailman3 after failed import |
[production] |
18:33 |
<brennen@deploy1002> |
rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.4 |
[production] |
18:28 |
<brennen@deploy1002> |
Synchronized php: group1 wikis to 1.37.0-wmf.4 (duration: 01m 07s) |
[production] |
18:27 |
<brennen@deploy1002> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.4 |
[production] |
18:23 |
<brennen> |
1.37.0-wmf.4 train status (T281145): blockers appear resolved, going ahead in the interest of not having a split deploy over weekend |
[production] |
18:07 |
<Majavah> |
generate and add k8s haproxy keepalived password (profile::toolforge::k8s::haproxy::keepalived_password) to private puppet repo |
[tools] |
17:50 |
<brennen@deploy1002> |
Synchronized php-1.37.0-wmf.4/includes/cache/LinkBatch.php: Backport: [[gerrit:685901|LinkBatch: skip bad input (T282180 T282070)]] (duration: 01m 06s) |
[production] |
17:25 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@20f479e]: updated trove -> codfw1dev (duration: 01m 55s) |
[production] |
17:23 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@20f479e]: updated trove -> codfw1dev |
[production] |
17:15 |
<bstorm> |
recreated recordset of k8s.tools.eqiad1.wikimedia.cloud as CNAME to k8s.svc.tools.eqiad1.wikimedia.cloud T282227 |
[tools] |
17:12 |
<bstorm> |
created A record of k8s.svc.tools.eqiad1.wikimedia.cloud pointing at current cluster with TTL of 300 for quick initial failover when the new set of haproxy nodes are ready T282227 |
[tools] |
17:00 |
<bstorm> |
deleted "toolsbeta-test-k8s-haproxy-2", "toolsbeta-test-k8s-haproxy-1" when the dns caches finally dropped T282227 |
[toolsbeta] |
16:37 |
<James_F> |
Zuul: [operations/software/mailman-templates] Add CI of debian-glue T282018 |
[releng] |
16:30 |
<bstorm> |
recreated k8s.toolsbeta.eqiad1.wikimedia.cloud. as a CNAME to k8s.svc.toolsbeta.eqiad1.wikimedia.cloud. T282227 |
[toolsbeta] |
16:16 |
<Majavah> |
create record k8s.svc.toolsbeta.eqiad1.wikimedia.cloud. pointing to haproxy vip T282227 |
[toolsbeta] |
15:10 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 01m 24s) |
[production] |
15:08 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |
15:03 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 01m 11s) |
[production] |
15:02 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |
15:02 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 01m 26s) |
[production] |
15:00 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |
15:00 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 01m 29s) |
[production] |
14:58 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |
14:57 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 01m 22s) |
[production] |
14:56 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |
14:41 |
<bblack@cumin1001> |
conftool action : set/pooled=yes; selector: name=cp203[34].codfw.wmnet |
[production] |
14:40 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 01m 19s) |
[production] |
14:38 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |
14:38 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev (duration: 00m 50s) |
[production] |
14:37 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@71f273c]: updated trove -> codfw1dev |
[production] |