2017-05-02
|
09:58 |
<END> |
(PASS) - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium) |
[production] |
09:56 |
<START> |
- Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium) |
[production] |
09:55 |
<_joe_> |
testing pre-switchover the step to restart & resync redises in dc_to (eqiad) |
[production] |
09:48 |
<jynus@naos> |
Synchronized wmf-config/db-codfw.php: Add db1097 (duration: 01m 00s) |
[production] |
09:47 |
<jynus@naos> |
Synchronized wmf-config/db-eqiad.php: Depool db1015 & add db1097 (duration: 01m 17s) |
[production] |
09:36 |
<hashar> |
Jenkins/CI is back up! |
[production] |
09:34 |
<hashar> |
Nodepool cannot add instances to Jenkins any more. Rolling back Jenkins to 2.32.3 |
[production] |
09:29 |
<akosiaris> |
Set description for ganeti2005, ganeti2006 on asw-a-codfw. T164011 |
[production] |
09:27 |
<akosiaris> |
create interface range ganeti on asw-a-codfw. T164011 |
[production] |
09:24 |
<akosiaris> |
remove configuration from ge-8/0/0, ge-8/0/3 from asw-b-codfw for ganeti2005, ganeti2006 move to row A. T164011 |
[production] |
09:21 |
<hashar> |
Starting Nodepool |
[production] |
09:16 |
<hashar> |
Stopping Nodepool |
[production] |
09:14 |
<hashar> |
OpenStack / wmflabs fails to create new instances |
[production] |
08:40 |
<hashar> |
Upgrading Jenkins to 2.46.2 - T144106 |
[production] |
08:40 |
<elukey> |
run puppet and restart nutcracker on eqiad hosts with profile::mediawiki::nutcracker |
[production] |
08:33 |
<hashar> |
Upgrading Jenkins to 2.32.3 - T144106 |
[production] |
08:32 |
<elukey> |
stop and mask redis on mc1001-mc1018 - T137345 |
[production] |
08:26 |
<hashar> |
Upgrading Jenkins to 2.19.4 - T144106 |
[production] |
08:14 |
<hashar> |
Installing Jenkins Pipeline plugin |
[production] |
08:04 |
<hashar> |
Installing Jenkins plugin Pipeline: Stage View https://plugins.jenkins.io/pipeline-stage-view |
[production] |
08:04 |
<hashar> |
Upgrading Jenkins to 2.7.4 - T144106 |
[production] |
07:59 |
<elukey> |
Swap mc1001->mc1012 with mc1019->mc2030 - T137345 (more informative :) |
[production] |
07:58 |
<elukey> |
Swap mc1001->mc1012 with mc1019->mc2030 |
[production] |
07:36 |
<_joe_> |
starting etcd replication codfw => eqiad |
[production] |
06:46 |
<_joe_> |
disabling etcd auth on conf1*, converting to use nginx for TLS/auth T159687 |
[production] |
03:10 |
<mattflaschen@naos> |
Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs/: Urgent deploy: Fix FlaggedRevs fatal, and also a filter issue: T164096 and T164049 (duration: 00m 56s) |
[production] |
02:45 |
<tstarling@naos> |
Synchronized php-1.29.0-wmf.21/includes/config/EtcdConfig.php: EtcdConfig backported bug fixes (duration: 01m 02s) |
[production] |
02:34 |
<tstarling@naos> |
Synchronized wmf-config/CommonSettings.php: siteinfo hook (duration: 02m 39s) |
[production] |
00:33 |
<tstarling@puppetmaster1001> |
conftool action : set/@read-write.yaml; selector: name=ReadOnly |
[production] |
00:33 |
<tstarling@puppetmaster1001> |
conftool action : set/@dc-codfw.yaml; selector: name=WMFMasterDatacenter |
[production] |
00:25 |
<TimStarling> |
populating production etcd with initial mediawiki config keys |
[production] |
2017-05-01
|
23:41 |
<mutante> |
netmon1002 - signed puppet cert, initial puppet run, accept salt-key, ... (T159756) |
[production] |
23:15 |
<mutante> |
netmon1002 - boot into PXE, initial OS install (T159756) |
[production] |
23:06 |
<bd808> |
Ran puppet cert clean striker-deploy03.striker.eqiad.wmflabs on labcontrol1001 |
[production] |
19:43 |
<ejegg> |
updated payments-wiki from 4c5630283c57efbc454cc70d47218f7f22ea252a to 57451dee67e498d445a6f9bc10d40acf3df65f38 |
[production] |
19:10 |
<mobrovac@naos> |
Finished deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version (duration: 02m 08s) |
[production] |
19:08 |
<mobrovac@naos> |
Started deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version |
[production] |
18:46 |
<mutante> |
temp. re-enabling puppet on restbase1018 and running it once to fix icinga config syntax error. then disabling it again. restbase service stopped before and after. this box has a broken disk. |
[production] |
18:35 |
<mutante> |
brought mc1018 back up, ran puppet on it and then on Icinga. parent was adjusted from asw-d-eqiad to asw2-2-eqiad. reduced icinga config errors by 50% :p (1 of 2 left, restbase1018) |
[production] |
18:28 |
<mutante> |
powercycling mc1018 |
[production] |
18:19 |
<mutante> |
manually removed asw-d-eqiad remnants from /etc/icinga/puppet_hosts.cfg to fix icinga config after gerrit:351167 / T148506. fixes Icinga config error. then puppet adds it back |
[production] |
18:03 |
<andrewbogott> |
restarting nova-fullstack tests but saving instance 2d60e8c5-fb2a-4681-ac0a-ae2162bb13fb for future research |
[production] |
17:03 |
<mutante> |
phab2001 - start/stop phd service - that fixed "systemd state" icinga check, even though phd does not run just like before |
[production] |
16:53 |
<bblack> |
reverting inter-caching routing from codfw-switchover period: https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchback |
[production] |
16:52 |
<bblack@neodymium> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=cache_upload,name=cp107[1234].eqiad.wmnet |
[production] |
16:19 |
<mobrovac@naos> |
Finished deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 (duration: 02m 19s) |
[production] |
16:17 |
<mobrovac@naos> |
Started deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 |
[production] |
15:46 |
<jynus> |
shutting down db1063 for maintenance T164107 |
[production] |
15:13 |
<bblack> |
restarting varnish backend on cp2002 (mailbox issues) |
[production] |
12:58 |
<Amir1> |
cleaning ores_classification rows for half an hour or so (T159753) |
[production] |