2017-05-02
|
07:36 |
<_joe_> |
starting etcd replication codfw => eqiad |
[production] |
06:46 |
<_joe_> |
disabling etcd auth on conf1*, converting to use nginx for TLS/auth T159687 |
[production] |
03:10 |
<mattflaschen@naos> |
Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs/: Urgent deploy: Fix FlaggedRevs fatal, and also a filter issue: T164096 and T164049 (duration: 00m 56s) |
[production] |
02:45 |
<tstarling@naos> |
Synchronized php-1.29.0-wmf.21/includes/config/EtcdConfig.php: EtcdConfig backported bug fixes (duration: 01m 02s) |
[production] |
02:34 |
<tstarling@naos> |
Synchronized wmf-config/CommonSettings.php: siteinfo hook (duration: 02m 39s) |
[production] |
00:33 |
<tstarling@puppetmaster1001> |
conftool action : set/@read-write.yaml; selector: name=ReadOnly |
[production] |
00:33 |
<tstarling@puppetmaster1001> |
conftool action : set/@dc-codfw.yaml; selector: name=WMFMasterDatacenter |
[production] |
00:25 |
<TimStarling> |
populating production etcd with initial mediawiki config keys |
[production] |
2017-05-01
|
23:41 |
<mutante> |
netmon1002 - signed puppet cert, initial puppet run, accept salt-key, etc. (T159756) |
[production] |
23:15 |
<mutante> |
netmon1002 - boot into PXE, initial OS install (T159756) |
[production] |
23:06 |
<bd808> |
Ran puppet cert clean striker-deploy03.striker.eqiad.wmflabs on labcontrol1001 |
[production] |
19:43 |
<ejegg> |
updated payments-wiki from 4c5630283c57efbc454cc70d47218f7f22ea252a to 57451dee67e498d445a6f9bc10d40acf3df65f38 |
[production] |
19:10 |
<mobrovac@naos> |
Finished deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version (duration: 02m 08s) |
[production] |
19:08 |
<mobrovac@naos> |
Started deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version |
[production] |
18:46 |
<mutante> |
temp. re-enabling puppet on restbase1018 and running it once to fix icinga config syntax error. then disabling it again. restbase service stopped before and after. this box has a broken disk. |
[production] |
18:35 |
<mutante> |
brought mc1018 back up, ran puppet on it and then on Icinga. parent was adjusted from asw-d-eqiad to asw2-2-eqiad. reduced icinga config errors by 50% :p (1 of 2 left, restbase1018) |
[production] |
18:28 |
<mutante> |
powercycling mc1018 |
[production] |
18:19 |
<mutante> |
manually removed asw-d-eqiad remnants from /etc/icinga/puppet_hosts.cfg to fix icinga config after gerrit:351167 / T148506. fixes Icinga config error. then puppet adds it back |
[production] |
18:03 |
<andrewbogott> |
restarting nova-fullstack tests but saving instance 2d60e8c5-fb2a-4681-ac0a-ae2162bb13fb for future research |
[production] |
17:03 |
<mutante> |
phab2001 - start/stop phd service - that fixed the "systemd state" icinga check, even though phd is still not running, same as before |
[production] |
16:53 |
<bblack> |
reverting inter-caching routing from codfw-switchover period: https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchback |
[production] |
16:52 |
<bblack@neodymium> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=cache_upload,name=cp107[1234].eqiad.wmnet |
[production] |
16:19 |
<mobrovac@naos> |
Finished deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 (duration: 02m 19s) |
[production] |
16:17 |
<mobrovac@naos> |
Started deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 |
[production] |
15:46 |
<jynus> |
shutting down db1063 for maintenance T164107 |
[production] |
15:13 |
<bblack> |
restarting varnish backend on cp2002 (mailbox issues) |
[production] |
12:58 |
<Amir1> |
cleaning ores_classification rows for half an hour or so (T159753) |
[production] |
11:31 |
<jynus> |
running alter table on categorylinks on db1054, 68, 62 T164185 |
[production] |
11:25 |
<jynus> |
running alter table on enwiki.categorylinks on db1052 T164185 |
[production] |
03:46 |
<tstarling@naos> |
Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/347537/ (duration: 01m 01s) |
[production] |
03:44 |
<tstarling@naos> |
Synchronized wmf-config/etcd.php: https://gerrit.wikimedia.org/r/#/c/347537/ (duration: 02m 39s) |
[production] |
2017-04-28
|
21:24 |
<Dereckson> |
End of live debug on mwdebug1001, restored previous state with a local scap pull |
[production] |
21:00 |
<ejegg> |
updated payments-wiki from 1620b8233321099262ff4333a2269f0563107e66 to 4c5630283c57efbc454cc70d47218f7f22ea252a |
[production] |
20:23 |
<Dereckson> |
Live debug on mwdebug1001 for T164059 |
[production] |
19:30 |
<jynus> |
shutting down db1063 - I see high temperatures reported, and going up T164107 |
[production] |
19:08 |
<urandom> |
T163936: reenabling puppet on restbase-dev1001 |
[production] |
18:14 |
<urandom> |
T163936: disabling puppet on restbase-dev1001 (t-shooting c-m-c) |
[production] |
17:08 |
<jynus> |
restarting replication on all nodes on s7-eqiad T164092 |
[production] |
16:38 |
<jynus> |
stopping replication on all nodes on s7-eqiad in case db1062 boots up in a corrupted state |
[production] |
16:36 |
<jynus> |
restarting db1062 once more T164092 |
[production] |
15:56 |
<godog> |
poweroff prometheus1004 for ram upgrade - T163385 |
[production] |