2020-05-28
|
19:17 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@f6d73c8]: Hotfix #2 today: forgot jars [analytics/refinery@f6d73c8] (duration: 16m 54s) |
[production] |
19:14 |
<twentyafterfour@deploy1001> |
rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.34 refs T253022 |
[production] |
19:01 |
<shdubsh> |
restart varnishmtail and atsmtail on cp5001.eqsin.wmnet |
[production] |
19:00 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@f6d73c8]: Hotfix #2 today: forgot jars [analytics/refinery@f6d73c8] |
[production] |
17:03 |
<twentyafterfour@deploy1001> |
Synchronized php: group1 wikis to 1.35.0-wmf.34 refs T253022 (duration: 01m 06s) |
[production] |
17:02 |
<twentyafterfour@deploy1001> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.34 refs T253022 |
[production] |
16:32 |
<jforrester@deploy1001> |
Synchronized php-1.35.0-wmf.34/extensions/Wikibase: T253804 Use ThrowingEntityTermStoreWriter when writers shouldn't be called (duration: 01m 15s) |
[production] |
15:37 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@203d182] (thin): Three hotfixes (THIN) [analytics/refinery@203d182] (duration: 00m 10s) |
[production] |
15:37 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@203d182] (thin): Three hotfixes (THIN) [analytics/refinery@203d182] |
[production] |
15:05 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@203d182]: Three hotfixes [analytics/refinery@203d182] (duration: 25m 59s) |
[production] |
15:02 |
<moritzm> |
installing exim4 security updates on jessie (stretch/buster already fixed) |
[production] |
14:39 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@203d182]: Three hotfixes [analytics/refinery@203d182] |
[production] |
14:33 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:30 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:01 |
<ema> |
atskafka 0.8 uploaded to buster-wikimedia T253551 |
[production] |
13:49 |
<godog> |
roll-restart prometheus k8s-staging to enable thanos upload - T252186 |
[production] |
13:36 |
<hashar> |
Restarting CI Jenkins for plugin rollback |
[production] |
11:49 |
<moritzm> |
installing unbound security updates |
[production] |
11:03 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Add db2138 to s2+s4 T252985', diff saved to https://phabricator.wikimedia.org/P11330 and previous config saved to /var/cache/conftool/dbconfig/20200528-110333-kormat.json |
[production] |
10:36 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' . |
[production] |
10:34 |
<jayme@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' . |
[production] |
10:30 |
<jayme@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . |
[production] |
10:02 |
<mutante> |
gerrit1002 (test server) - chown -R gerrit2:gerrit2 /var/lib/gerrit/review_site ; restarted gerrit service, now the service is not in restart loop anymore, gerrit-ssh is listening too, just not accepting publickey (T239151) |
[production] |
09:51 |
<XioNoX> |
failover VRRP in ulsfo |
[production] |
09:41 |
<XioNoX> |
re-activate peering/transit on cr2-eqdfw - T243080 |
[production] |
09:35 |
<mutante> |
restarting gerrit on gerrit1002 after fixing db_pass to the readonly one (T243800) |
[production] |
09:33 |
<XioNoX> |
restart cr2-eqdfw for upgrade - T243080 |
[production] |
09:30 |
<XioNoX> |
deactivate peering/transit on cr2-eqdfw - T243080 |
[production] |
09:25 |
<_joe_> |
updating ACLs on all etcd servers |
[production] |
09:22 |
<XioNoX> |
install new Junos on cr2-eqdfw - T243080 |
[production] |
09:16 |
<XioNoX> |
rollback cr2-eqord ospf/bgp - T243080 |
[production] |
09:07 |
<XioNoX> |
restart cr2-eqord for upgrade - T243080 |
[production] |
09:05 |
<jayme@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . |
[production] |
08:50 |
<_joe_> |
upgrading etcd ACLs (adding new users) to conf1004 |
[production] |
08:50 |
<XioNoX> |
install new Junos on cr2-eqord - T243080 |
[production] |
08:46 |
<XioNoX> |
deactivate peering/transit on cr2-eqord - T243080 |
[production] |
08:45 |
<XioNoX> |
de-pref all OSPF links to cr2-eqord - T243080 |
[production] |
08:13 |
<marostegui> |
Pool db1141 into labsdb analytics role - T249188 |
[production] |
07:33 |
<gilles@deploy1001> |
Synchronized static/images: T252108 Deploying optimised static PNGs (duration: 01m 39s) |
[production] |
07:31 |
<gilles@deploy1001> |
Synchronized static/apple-touch: T252108 Deploying optimised static PNGs (duration: 01m 12s) |
[production] |
06:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove db1081 from API and set its weight to 0 on main traffic - preparation for tomorrow's failover T253808', diff saved to https://phabricator.wikimedia.org/P11329 and previous config saved to /var/cache/conftool/dbconfig/20200528-063037-marostegui.json |
[production] |
04:44 |
<marostegui> |
Run check_private data on db1141 - T249188 |
[production] |
04:22 |
<marostegui> |
Stop MySQL on db1141 - T249188 |
[production] |