2020-04-17
|
19:33 |
<Krinkle> |
Depool mw1407.eqiad.wmnet for opcache testing. Do not repool without first reverting https://gerrit.wikimedia.org/r/589674. |
[production] |
19:32 |
<Krinkle> |
Depool mw1407.eqiad.wmnet for opcache and LCStoreStaticArray testing. – T99740 |
[production] |
17:41 |
<cmjohnson1> |
replacing network cable pc1009 T250257 |
[production] |
17:34 |
<cmjohnson1> |
moving msw1 to msw-c racks mounted switch cable ports from port 49 to port 50 |
[production] |
17:22 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
17:22 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
16:15 |
<Urbanecm> |
Revert recent email change for User:CPHL@SUL |
[production] |
16:05 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . |
[production] |
16:05 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
15:52 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . |
[production] |
15:52 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . |
[production] |
15:48 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'canary' . |
[production] |
15:48 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'production' . |
[production] |
15:42 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:41 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:20 |
<rzl> |
remove cronjobs from mwmaint1002 previously updated to systemd timers and erroneously left in crontab -- diffs: https://phabricator.wikimedia.org/P11012 T211250 |
[production] |
14:29 |
<mutante> |
ganeti2001 - killed and restarted gnt-rapi process with the correct new key and cert |
[production] |
14:19 |
<cdanis> |
add peer AS29802 to cr2-eqdfw and cr2-esams |
[production] |
14:01 |
<mutante> |
netbox1001 - netbox_ganeti_eqiad_sync / systemd state fixed after gnt-rapi is running again on ganeti1003 |
[production] |
14:00 |
<mutante> |
ganeti1003 - fixing gnt-rapi daemon not running |
[production] |
13:54 |
<mateusbs17> |
Running VACUUM FULL for gis DB in maps2004.codfw.wmnet (which is depooled at the moment) |
[production] |
13:00 |
<mutante> |
netbox1001 - sudo systemctl start netbox_ganeti_eqiad_sync (was failed) |
[production] |
12:54 |
<mutante> |
contint2001 /usr/local/sbin/build-envoy-config -c /etc/envoy ; restart envoyproxy; was not listening on admin port |
[production] |
12:45 |
<mutante> |
contint2001 - restart nagios-nrpe-server |
[production] |
12:28 |
<moritzm> |
copied kubernetes-client from stretch-wikimedia to buster-wikimedia T224591 |
[production] |
11:35 |
<mutante> |
contint2001 - apt-get update, run puppet to install helm-diff |
[production] |
11:33 |
<jayme> |
imported helm-diff 2.11.0+3-2+deb10u1 to main for buster-wikimedia |
[production] |
11:23 |
<dzahn@cumin2001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
11:23 |
<dzahn@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
11:22 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
11:21 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
11:20 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
11:20 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
10:17 |
<_joe_> |
contint1001:~$ sudo systemctl restart envoyproxy.service |
[production] |
10:16 |
<_joe_> |
contint1001:~$ sudo /usr/local/sbin/build-envoy-config -c /etc/envoy |
[production] |
10:07 |
<kormat> |
change pc2010 to replicate from pc1010 T247787 |
[production] |
09:54 |
<kormat> |
enabling replication from pc1007 to pc1010 T247787 |
[production] |
09:20 |
<jayme> |
imported helm 2.12.2 to main for buster-wikimedia |
[production] |
09:07 |
<vgutierrez> |
disable KA between ats-tls and varnish-fe on cp1077 - T248938 |
[production] |
09:00 |
<kormat> |
dropping wikidatawiki.wb_items_per_site_old table in eqiad (non-labs hosts) T250345 |
[production] |
08:15 |
<kormat> |
dropping wikidatawiki.wb_items_per_site_old table in codfw T250345 |
[production] |
07:54 |
<ema> |
cache_text: puppet run to stop vhtcpd and start purged T249325 |
[production] |
07:45 |
<gehel> |
restart wdqs-updater on all nodes after deployment |
[production] |
06:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1092 after compression', diff saved to https://phabricator.wikimedia.org/P11005 and previous config saved to /var/cache/conftool/dbconfig/20200417-063138-marostegui.json |
[production] |
06:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove db1111 from API', diff saved to https://phabricator.wikimedia.org/P11004 and previous config saved to /var/cache/conftool/dbconfig/20200417-063038-marostegui.json |
[production] |
06:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1092 after compression', diff saved to https://phabricator.wikimedia.org/P11003 and previous config saved to /var/cache/conftool/dbconfig/20200417-062642-marostegui.json |
[production] |