2017-05-02 §
10:20 <elukey> restart ocg on ocg1002 (localhost:8000 - frontend - not reachable) [production]
10:12 <hashar> Upgrading Jenkins to 2.46.1 - T144106 [production]
10:11 <jynus> stopping replication on db1015 [production]
09:58 <END> (PASS) - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium) [production]
09:56 <START> - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium) [production]
09:55 <_joe_> testing pre-switchover the step to restart & resync redises in dc_to (eqiad) [production]
09:48 <jynus@naos> Synchronized wmf-config/db-codfw.php: Add db1097 (duration: 01m 00s) [production]
09:47 <jynus@naos> Synchronized wmf-config/db-eqiad.php: Depool db1015 & add db1097 (duration: 01m 17s) [production]
09:36 <hashar> Jenkins/CI is back up! [production]
09:34 <hashar> Nodepool cannot add instances to Jenkins any more. Rolling back Jenkins to 2.32.3 [production]
09:29 <akosiaris> Set description for ganeti2005, ganeti2006 on asw-a-codfw. T164011 [production]
09:27 <akosiaris> create interface range ganeti on asw-a-codfw. T164011 [production]
09:24 <akosiaris> remove configuration from ge-8/0/0, ge-8/0/3 from asw-b-codfw for ganeti2005, ganeti2006 move to row A. T164011 [production]
09:21 <hashar> Starting Nodepool [production]
09:16 <hashar> Stopping Nodepool [production]
09:14 <hashar> OpenStack / wmflabs fails to create new instances [production]
08:40 <hashar> Upgrading Jenkins to 2.46.2 - T144106 [production]
08:40 <elukey> run puppet and restart nutcracker on eqiad hosts with profile::mediawiki::nutcracker [production]
08:33 <hashar> Upgrading Jenkins to 2.32.3 - T144106 [production]
08:32 <elukey> stop and mask redis on mc1001-mc1018 - T137345 [production]
08:26 <hashar> Upgrading Jenkins to 2.19.4 - T144106 [production]
08:14 <hashar> Installing Jenkins Pipeline plugin [production]
08:04 <hashar> Installing Jenkins plugin Pipeline: Stage View https://plugins.jenkins.io/pipeline-stage-view [production]
08:04 <hashar> Upgrading Jenkins to 2.7.4 - T144106 [production]
07:59 <elukey> Swap mc1001->mc1012 with mc1019->mc2030 - T137345 (more informative :) [production]
07:58 <elukey> Swap mc1001->mc1012 with mc1019->mc2030 [production]
07:36 <_joe_> starting etcd replication codfw => eqiad [production]
06:46 <_joe_> disabling etcd auth on conf1*, converting to use nginx for TLS/auth T159687 [production]
03:10 <mattflaschen@naos> Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs/: Urgent deploy: Fix FlaggedRevs fatal, and also a filter issue: T164096 and T164049 (duration: 00m 56s) [production]
02:45 <tstarling@naos> Synchronized php-1.29.0-wmf.21/includes/config/EtcdConfig.php: EtcdConfig backported bug fixes (duration: 01m 02s) [production]
02:34 <tstarling@naos> Synchronized wmf-config/CommonSettings.php: siteinfo hook (duration: 02m 39s) [production]
00:33 <tstarling@puppetmaster1001> conftool action : set/@read-write.yaml; selector: name=ReadOnly [production]
00:33 <tstarling@puppetmaster1001> conftool action : set/@dc-codfw.yaml; selector: name=WMFMasterDatacenter [production]
00:25 <TimStarling> populating production etcd with initial mediawiki config keys [production]
2017-05-01 §
23:41 <mutante> netmon1002 - signed puppet cert, initial puppet run, accept salt-key,.. (T159756) [production]
23:15 <mutante> netmon1002 - boot into PXE, initial OS install (T159756) [production]
23:06 <bd808> Ran puppet cert clean striker-deploy03.striker.eqiad.wmflabs on labcontrol1001 [production]
19:43 <ejegg> updated payments-wiki from 4c5630283c57efbc454cc70d47218f7f22ea252a to 57451dee67e498d445a6f9bc10d40acf3df65f38 [production]
19:10 <mobrovac@naos> Finished deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version (duration: 02m 08s) [production]
19:08 <mobrovac@naos> Started deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version [production]
18:46 <mutante> temp. re-enabling puppet on restbase1018 and running it once to fix icinga config syntax error. then disabling it again. restbase service stopped before and after. this box has a broken disk. [production]
18:35 <mutante> brought mc1018 back up, ran puppet on it and then on Icinga. parent was adjusted from asw-d-eqiad to asw2-2-eqiad. reduced icinga config errors by 50% :p (1 of 2 left, restbase1018) [production]
18:28 <mutante> powercycling mc1018 [production]
18:19 <mutante> manually removed asw-d-eqiad remnants from /etc/icinga/puppet_hosts.cfg to fix icinga config after gerrit:351167 / T148506. fixes Icinga config error. then puppet adds it back [production]
18:03 <andrewbogott> restarting nova-fullstack tests but saving instance 2d60e8c5-fb2a-4681-ac0a-ae2162bb13fb for future research [production]
17:03 <mutante> phab2001 - start/stop phd service - that fixed "systemd state" icinga check, even though phd does not run just like before [production]
16:53 <bblack> reverting inter-caching routing from codfw-switchover period: https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchback [production]
16:52 <bblack@neodymium> conftool action : set/pooled=yes; selector: dc=eqiad,cluster=cache_upload,name=cp107[1234].eqiad.wmnet [production]
16:19 <mobrovac@naos> Finished deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 (duration: 02m 19s) [production]
16:17 <mobrovac@naos> Started deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 [production]