2014-07-30
|
16:00 |
<bd808> |
Puppet runs on videoscaler01 and jobrunner01 failing for "Could not find dependency Ferm::Rule[bastion-ssh] for Ferm::Rule[deployment-bastion-scap-ssh]" |
[releng] |
16:00 |
<bd808> |
Puppet seems manually disabled on apache0[12]. |
[releng] |
15:59 |
<bd808> |
Can't ssh to apache0[12], videoscaler01 and jobrunner01. Puppet not running on any of them. libnss-ldapd unattended update has broken /etc/nslcd.conf |
[releng] |
15:23 |
<bd808> |
Removed cherry-pick for Iac547efa83cf059a1276b6e279c3ebd4c7224b2c and updated cherry-pick for I5afba2c6b0fbf90ff8495cc4a82f5c7851893b52 to latest patch set. |
[releng] |
15:05 |
<bd808> |
Two cherry-picks in puppet conflicting with merged production changes: I5afba2c6b0fbf90ff8495cc4a82f5c7851893b52 and Iac547efa83cf059a1276b6e279c3ebd4c7224b2c (ori, twentyafterfour) |
[releng] |
14:49 |
<bd808> |
Started apache2 service on deployment-mediawiki01 |
[releng] |
14:16 |
<hashar> |
rebooting hhvm |
[releng] |
09:42 |
<hashar> |
bastion had broken puppet because deployment_server and zuul both declare the same python packages {{gerrit|150501}} |
[releng] |
09:40 |
<hashar> |
restoring on puppetmaster modules/mediawiki/templates/apache/apache2.conf.erb which got deleted somehow |
[releng] |
09:29 |
<hashar> |
Rebooting apache01/02 to see whether it fixes the ssh connection issue |
[releng] |
09:27 |
<hashar> |
manually started hhvm on mediawiki01 |
[releng] |
09:25 |
<hashar> |
rebooting deployment-mediawiki01; hhvm process went zombie |
[releng] |
09:23 |
<hashar> |
restarting hhvm on mediawiki 01/02 |
[releng] |
09:05 |
<hashar_> |
Beta scap script broken since 6:30am UTC https://integration.wikimedia.org/ci/job/beta-scap-eqiad/ |
[releng] |
2014-07-29
|
22:56 |
<cscott> |
updated OCG to version aeb8623d6ebe41ae7c7e36c57844bd9ea8e6d595 |
[releng] |
21:02 |
<bd808> |
Converted deployment-sentry2.eqiad.wmflabs to use beta salt/puppet master |
[releng] |
19:14 |
<hashar> |
Removed all jobs from queue, restarted slave agent. Update Jobs coming back |
[releng] |
19:09 |
<hashar> |
deployment-bastion jenkins slave is stuck. Beta cluster is no longer updating code :-// |
[releng] |
15:58 |
<godog> |
restarted hhvm on deployment-mediawiki01 |
[releng] |
15:52 |
<godog> |
restarted hhvm on deployment-mediawiki02 |
[releng] |
15:50 |
<godog> |
installed libevent-dbg on deployment-mediawiki02 to capture an hhvm backtrace |
[releng] |
15:17 |
<bd808> |
_joe_ restarting hhvm on deployment-mediawiki01 |
[releng] |
15:00 |
<bd808> |
Apache stuck with 65 children on both deployment-mediawiki servers |
[releng] |
10:37 |
<hashar> |
Restarted hhvm on mediawiki{01,02} |
[releng] |
2014-07-28
|
17:41 |
<bd808> |
Updated hhvm to latest 3.3-dev+20140728 build on deployment-mediawiki0[12] |
[releng] |
15:37 |
<manybubbles> |
rebuilding elasticsearch indexes to build a weighted all field we'll try to use to improve performance |
[releng] |
15:32 |
<bd808> |
Restarted hhvm on deployment-mediawiki0[12]. All apache children were stuck waiting for hhvm to respond. |
[releng] |
15:20 |
<bd808> |
Restarted apache on deployment-mediawiki02. 65 children and non-responsive to requests. (same as mediawiki01) |
[releng] |
15:18 |
<bd808> |
Restarted apache on deployment-mediawiki01. 65 children and non-responsive to requests. |
[releng] |
14:23 |
<manybubbles> |
or not - looks like I can't! |
[releng] |
14:22 |
<manybubbles> |
rebuilding cirrus search indexes to pick up a speed up all field |
[releng] |
08:30 |
<hashar> |
restarted varnish on deployment-cache-bits01. Hoping to clear bits cache |
[releng] |