2020-06-08
10:39 <XioNoX> depool codfw - T243080 [production]
09:46 <moritzm> installing gnutls28 security updates on buster (older releases not affected) [production]
09:32 <qchris> Turning on puppet on gerrit1002 again to avoid starting to lag too far behind [production]
08:17 <XioNoX> push T250136 to eqsin - T250136 [production]
08:09 <XioNoX> push T250136 to eqiad - T250136 [production]
08:07 <moritzm> upgrading mw1349-mw1383 to PHP 7.2.31 [production]
08:07 <mutante> stat1006 moved broken jupyter-dedcode-singleuser.service out of /run/systemd/transient. systemctl reset-failed [production]
08:02 <XioNoX> push T250136 to codfw - T250136 [production]
07:58 <XioNoX> push T250136 to eqord/eqdfw - T250136 [production]
07:58 <mutante> stat1006 bash[40607]: /bin/bash: line 0: exec: jupyterhub-singleuser: not found [production]
07:57 <mutante> ran puppet on all stat* hosts for an access request (dcipoletti was added) - stat1006 systemd state broke right after, jupyter-dedcode-singleuser.service failed [production]
07:46 <XioNoX> push T250136 to esams/knams - T250136 [production]
07:42 <XioNoX> cr4-ulsfo protocols bgp group Transit4 family inet any -> unicast - T250136 [production]
07:39 <XioNoX> cr3-ulsfo protocols bgp group Transit4 family inet any -> unicast - T250136 [production]
07:37 <moritzm> installing nodejs security updates [production]
07:05 <marostegui> Stop MySQL on labsdb1012 to clone labsdb1011 T249188 [production]
05:22 <marostegui> Upgrade db1077 to 10.4.13 to test events memory leak [production]
04:45 <_joe_> de-firewalling mc1029 [production]
04:27 <_joe_> firewalling off memcached on mc1029 [production]
2020-06-05
16:45 <elukey@deploy1001> Finished deploy [analytics/turnilo/deploy@f7e4f78]: Upgrade to 1.24.0 (duration: 00m 11s) [production]
16:45 <elukey@deploy1001> Started deploy [analytics/turnilo/deploy@f7e4f78]: Upgrade to 1.24.0 [production]
16:29 <bd808> Testing stashbot following hard restart of service. It was having LDAP connection failure problems. [production]
16:00 <AndyRussG> Turned off Fundraising job recurring_smashpig_charge [production]
15:54 <cdanis> enabling & rerunning puppet on netflow* T254574 [production]
15:39 <cdanis> disabling puppet on netflow* and trying I6598d8f8 on netflow3001 first T254574 [production]
13:33 <jayme@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'mathoid' for release 'staging' . [production]
13:19 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' . [production]
13:19 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:19 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' . [production]
13:18 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . [production]
13:15 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:55 <ladsgroup@deploy1001> Synchronized wmf-config/interwiki.php: Hotfix for be-tarask interwiki link being broken (T111853) (duration: 01m 00s) [production]
12:41 <mutante> rebooting gerrit1002 to add more vCPUs, after [ganeti1009:~] $ sudo gnt-instance modify -B vcpus=8 gerrit1002.wikimedia.org T239151 [production]
12:20 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'zotero' for release 'staging' . [production]
12:19 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . [production]
12:19 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' . [production]
12:19 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' . [production]
12:19 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . [production]
12:17 <akosiaris> update blubberoid changeprop changeprop-jobqueue citoid cxserver wikifeeds zotero in staging to latest charts [production]
12:17 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]
12:17 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . [production]
12:17 <akosiaris> fix typo in ganeti2016 /etc/network/interfaces and reboot [production]
11:28 <akosiaris> master-failover from ganeti2001 to ganeti2019 for ganeti01.svc.codfw.wmnet [production]
11:25 <akosiaris@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [production]
11:25 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [production]
11:25 <akosiaris@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . [production]
11:14 <mutante> running puppet on all ganeti nodes [production]
11:05 <ladsgroup@deploy1001> Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 14s) [production]
10:32 <elukey@cumin1001> END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) [production]